SAS Code that works like Excel's "VLOOKUP" function

I'm looking for a SAS Code that works just like "VLOOKUP" function in Excel.
I have two tables:
table_1 has 10 rows, with an ID column and some other columns. table_2 has 50 rows and two columns: ID and Definition. I want to define a new variable "Definition" in table_1 and look up its values from table_2 by ID.
I haven't really tried anything other than merge, but merge keeps all the extra 40 rows from table_2, which is not what I want.
Thanks, SE

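The simplest way is to use the keep= dataset option on your merge statement, together with "if a" so that only the rows from table_1 are kept. Both datasets must already be sorted by id.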
data result;
merge table_1 (in=a) table_2 (in=b keep=id definition);
by id;
if a;
run;
An alternative that means you don't have to sort your datasets is to use proc sql.
proc sql;
create table result as
select a.*,
b.definition
from table_1 a
left join table_2 b on a.id = b.id;
quit;
Finally, there is the hash table option if table_2 is small:
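data result;
if 0 then set table_2 (keep=id definition); /* defines definition in the PDV with the right type, reads no data */
if _n_ = 1 then do;
declare hash b(dataset:'table_2');
b.definekey('id');
b.definedata('definition');
b.definedone();
end;
set table_1;
if b.find() ne 0 then call missing(definition); /* no match: clear the value left over from the previous row */
run;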
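For readers who want to check the expected shape of the result outside SAS, here is a minimal sketch of the same left-join lookup in Python, using an in-memory SQLite database as a stand-in. The function name vlookup, the column "other" and the sample rows are all made up for illustration; only table_1, table_2, id and definition come from the question.

```python
import sqlite3

def vlookup(main_rows, lookup_rows):
    """Return each main row with its matching definition (None when no match)."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE table_1 (id INTEGER, other TEXT)")
    con.execute("CREATE TABLE table_2 (id INTEGER, definition TEXT)")
    con.executemany("INSERT INTO table_1 VALUES (?, ?)", main_rows)
    con.executemany("INSERT INTO table_2 VALUES (?, ?)", lookup_rows)
    # LEFT JOIN keeps every table_1 row and none of the extra table_2 rows
    rows = con.execute(
        "SELECT a.id, a.other, b.definition "
        "FROM table_1 a LEFT JOIN table_2 b ON a.id = b.id "
        "ORDER BY a.id"
    ).fetchall()
    con.close()
    return rows

# Hypothetical sample data: id 9 has no definition, ids 3 and 4 are unused
result = vlookup(
    [(1, "x"), (2, "y"), (9, "z")],
    [(1, "one"), (2, "two"), (3, "three"), (4, "four")],
)
print(result)
```

The output has exactly as many rows as table_1, which is the behaviour the "if a" filter and the left join are meant to give.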

Here is one very useful (and often very fast) method specifically for 1:1 matching, which is what VLOOKUP does. You create a Format or Informat with the match-variable and the lookup-result, and put or input the match-variable in the master table.
data class_income;
set sashelp.class(keep=name);
income = ceil(12*ranuni(7));
run;
data for_format;
set class_income end=eof;
retain fmtname 'INCOMEI';
start=name;
label=income;
type='i'; *i=informat numeric, j=informat character, n=format numeric, c=format character;
output;
if eof then do;
hlo='o'; *hlo contains some flags, o means OTHER for nonmatching records;
start=' ';
label=.;
output;
end;
run;
proc format cntlin=for_format;
quit;
data class;
set sashelp.class;
income = input(name,INCOMEI.);
run;
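Conceptually, the informat built above behaves like a dictionary with a fallback for unmatched keys. A rough Python sketch of that idea follows; build_lookup and the sample incomes are hypothetical and not part of the SAS code.

```python
def build_lookup(pairs, other=None):
    """Build a lookup function from (key, value) pairs.

    `other` plays the role of the HLO='O' (OTHER) row in the CNTLIN dataset:
    it is returned for any key that has no match.
    """
    table = dict(pairs)

    def lookup(key):
        return table.get(key, other)

    return lookup

# Hypothetical name -> income pairs standing in for class_income
income_of = build_lookup([("Alfred", 5), ("Alice", 11)])
print(income_of("Alice"), income_of("Nobody"))
```

As with the informat, each key maps to a single value, so this only suits 1:1 (or many:1) lookups.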

Azure Data Factory - Executing Mathematical Operation from column value

I am new to Azure Data Factory, and I have searched everywhere for a solution that fits my needs, but I haven't found one.
My Problem:
I have a table in Azure Database with a column containing a mathematical operation, about 50 columns containing the variables for the operation and one last column where I need to update the result of the mathematical operation, like this:
[Image: example of the table]
What I want to do is to fill up the column "result" with the result of the mathematical operation, contained in the column "Operation", using the other columns values in the expression. This is just an example table, my actual table has about 50 columns of values, so it is not a solution for me to use a "replace" operation.
There are probably a few ways to do this but I would not use Data Factory, unless you need to orchestrate this activity as part of a wider pipeline. As you have some compute handy via Azure SQL Database, I would make best use of that unless you have a specific reason not to do so. T-SQL has dynamic SQL and the EXEC command to help. Use a cursor to run through the distinct list of formulas and execute it dynamically. A simplified example:
DROP TABLE IF EXISTS dbo.formulas;
CREATE TABLE dbo.formulas (
Id INT PRIMARY KEY,
formula VARCHAR(100) NOT NULL,
a INT NOT NULL,
b INT NOT NULL,
c INT NOT NULL,
d INT NOT NULL,
e INT NOT NULL,
--...
result INT
);
-- Set up test data
INSERT INTO dbo.formulas ( Id, formula, a, b, c, d, e )
VALUES
( 1, '(a+b)/d', 1, 20, 2, 3, 1 ),
( 2, '(c+b)*(a+e)', 0, 1, 2, 3, 4 ),
( 3, 'a*(d+e+c)', 7, 10, 6, 2, 1 )
SET NOCOUNT ON
-- Create local fast_forward ( forward-only, read-only ) cursor
-- Get the distinct formulas for the table
DECLARE formulaCursor CURSOR FAST_FORWARD LOCAL FOR
SELECT DISTINCT formula
FROM dbo.formulas
-- Cursor variables
DECLARE @sql NVARCHAR(MAX)
DECLARE @formula NVARCHAR(100)
OPEN formulaCursor
FETCH NEXT FROM formulaCursor INTO @formula
WHILE @@fetch_status = 0
BEGIN
SET @sql = 'UPDATE dbo.formulas
SET result = ' + @formula + '
--OUTPUT inserted.id -- optionally output updated ids
WHERE formula = ''' + @formula + ''';'
PRINT @sql
-- Update each result field for the current formula
EXEC(@sql)
FETCH NEXT FROM formulaCursor INTO @formula
END
CLOSE formulaCursor
DEALLOCATE formulaCursor
GO
SET NOCOUNT OFF
GO
-- Check the results
SELECT *
FROM dbo.formulas;
Cursors have a bad reputation for performance, but i) here I'm using the distinct list of formulas and ii) sometimes it's the only way. I can't think of a nice set-based way of doing this; happy to be corrected. CLR is not available to you. If performance is a major issue for you, you may need to think about alternatives; there's an interesting discussion on a similar problem here.
My results: [screenshot not shown]
If your database was an Azure Synapse Analytics dedicated SQL pool then you could look at Azure Synapse Notebooks to achieve the same outcome.
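The "one dynamic UPDATE per distinct formula" idea from the T-SQL above can also be sketched in Python. This is only an illustration under assumptions: an in-memory SQLite database stands in for Azure SQL Database, and the table is the same simplified five-variable test table, not the real 50-column one.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE formulas (id INTEGER PRIMARY KEY, formula TEXT NOT NULL, "
    "a INTEGER, b INTEGER, c INTEGER, d INTEGER, e INTEGER, result INTEGER)"
)
con.executemany(
    "INSERT INTO formulas (id, formula, a, b, c, d, e) VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "(a+b)/d", 1, 20, 2, 3, 1),
        (2, "(c+b)*(a+e)", 0, 1, 2, 3, 4),
        (3, "a*(d+e+c)", 7, 10, 6, 2, 1),
    ],
)

# Fetch the distinct formulas first, then run one UPDATE per formula
distinct = [row[0] for row in con.execute("SELECT DISTINCT formula FROM formulas").fetchall()]
for formula in distinct:
    # The formula text is spliced into the statement as SQL, so it must come
    # from a trusted source; only the WHERE value is passed as a parameter.
    con.execute(f"UPDATE formulas SET result = {formula} WHERE formula = ?", (formula,))

results = con.execute("SELECT id, result FROM formulas ORDER BY id").fetchall()
print(results)
```

Because the loop runs once per distinct formula rather than once per row, the number of dynamic statements stays small even when many rows share a formula.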

SAS: Select string with blank at the end

I have a problem while selecting data from an Oracle table. Some values are R, RJ, and so on. However when I run the following query I just get RJ:
proc sql noprint;
SELECT col
FROM myoracletable
WHERE col IN('R','RJ')
;
quit;
So I checked the value R at Oracle:
select distinct rawtohex(col) as col
from myoracletable;
The result for R is 5220. So the string is R[blank]. I modified my SAS program like this:
proc sql noprint;
SELECT col
FROM myoracletable
WHERE col IN('R ','RJ','5220'x)
;
quit;
However the entries with R are still not selected.
How can I solve this issue without trimming or compressing the string?
SAS uses only fixed length strings and strips trailing blanks to make it work. So it looks like SAS is pushing 'R' instead of 'R ' into the database when it converts your query for you.
You need to write the Oracle query directly instead. So instead of using implicit syntax like:
libname myora oracle ... schema=myoraschema ... ;
proc sql ;
SELECT col
FROM myora.mytable
WHERE col IN('R ','RJ')
;
You should use explicit syntax like this:
libname myora oracle ..... ;
proc sql ;
connect using myora ;
select * from connection to myora
(SELECT col
FROM myoraschema.mytable
WHERE col IN('R ','RJ')
)
;
The long term solution is to fix the Oracle table to NOT store trailing blanks. You might need to redefine the variable as VARCHAR(2) instead of CHAR(2).
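To see why the literal that actually reaches the database matters, here is a small Python sketch using SQLite as a stand-in. It is an assumption-laden illustration only: SQLite compares text exactly, whereas Oracle's rules depend on the column type, and the table t is made up.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (col TEXT)")
# 'R ' is stored with a trailing blank, like the 5220 hex value in the question
con.executemany("INSERT INTO t VALUES (?)", [("R ",), ("RJ",)])

# Literal with the blank stripped, as if the client had trimmed it
stripped = con.execute(
    "SELECT col FROM t WHERE col IN ('R', 'RJ') ORDER BY col"
).fetchall()
# Literal with the blank preserved, as an explicit pass-through query would send it
preserved = con.execute(
    "SELECT col FROM t WHERE col IN ('R ', 'RJ') ORDER BY col"
).fetchall()
print(stripped, preserved)
```

Under exact comparison, only the query that keeps the trailing blank finds the 'R ' row.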
I wasn't able to reproduce this behaviour within SAS itself:
data have;
input mystr :$2.;
cards;
R
RJ
;
run;
proc sql;
select mystr from have where mystr in ('R', 'RJ');
quit;
This selects both values. So one option might be to run an initial query to copy part of the Oracle table into a temporary dataset in your SAS work library, then run another query on that.
You might also get different results if you use a pass-through query to access your Oracle table, but I can't test that.

how can i dynamically pass value to db2 search clause 'like' while fetching result from other table

Can someone help me dynamically pass a value to the DB2 LIKE search clause while fetching that value from another table?
I am trying this:
select * from table2 where file_name like '%(select file_name from table1)'
I've even tried CONCAT and sysibm.sysdummy1 methods, but no luck.
Maybe this helps:
SELECT *
FROM table2
JOIN table1
ON table2.file_name LIKE CONCAT('%',table1.file_name)
No DDL, sample data, or expected results were given, so a reader cannot tell whether other considerations apply. The following variation of the answer already offered is more liberal in interpreting what the OP intended by select * from table2 where file_name like '%(select file_name from table1)': rather than an ends-with (or starts-with) predicate on the file-name value, it applies a contains predicate.
select /* t1.file_name, */ t2.*
from table2 as t2
inner join
table1 as t1
on t2.file_name like '%' concat rtrim(t1.file_name) concat '%'
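The difference between the ends-with and contains predicates can be checked with a short Python sketch. SQLite stands in for DB2 here (so || replaces concat), and the file names are invented sample data.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE table1 (file_name TEXT)")
con.execute("CREATE TABLE table2 (file_name TEXT)")
con.executemany("INSERT INTO table1 VALUES (?)", [("report.csv",), ("data",)])
con.executemany(
    "INSERT INTO table2 VALUES (?)",
    [("2020_report.csv",), ("data_dump.txt",), ("notes.md",)],
)

# Ends-with: the pattern is '%' followed by the table1 value
ends_with = con.execute(
    "SELECT t2.file_name FROM table2 t2 JOIN table1 t1 "
    "ON t2.file_name LIKE '%' || t1.file_name "
    "ORDER BY t2.file_name"
).fetchall()

# Contains: the table1 value (right-trimmed) wrapped in '%' on both sides
contains = con.execute(
    "SELECT t2.file_name FROM table2 t2 JOIN table1 t1 "
    "ON t2.file_name LIKE '%' || rtrim(t1.file_name) || '%' "
    "ORDER BY t2.file_name"
).fetchall()
print(ends_with, contains)
```

Note that any % or _ characters inside the table1 values would themselves act as wildcards in either pattern.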

SAS: Macro variable and string. Correct TableName

This is a part of macro:
%let mvTableName = "MyTable";
proc append base = &mvTableName data = TEMP_TABLE;
run;
And I can't find the table in WORK.
After that I checked how the table gets created:
data &mvTableName;
run;
And I see in the log: Dataset MyTable ...
But when I change the assignment to %let mvTableName=MyTable;
I see this in the log: Dataset WORK.MyTable ..
How can this be explained?
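If you are going to use mvTableName as a dataset name (for example in a BASE= or DATA= option), don't include the double quotes. Macro variables are plain text substitution, so the quotes become part of the generated code, and SAS treats a quoted dataset name as a physical file name rather than a dataset in WORK.
Assuming MyTable and Temp_table are SAS data sets in the WORK library, this should work: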
%Let mvTableName=MyTable;
Proc Append base=&mvTableName data=temp_table;
run;
Also,
Data &mvTableName;
Run;
creates an empty data set, so the table named by mvTableName would be overwritten with an empty data set.

Constructing dynamic columns from parameters in Sybase

I'm trying to write a stored proc (SP) in Sybase.
The SP takes 5 varchar parameters.
Based on the parameters passed, I want to construct the column names to be selected from a particular table.
The below works:
DECLARE @TEST VARCHAR(50)
SELECT @TEST = "country"
--print @TEST
execute("SELECT DISTINCT id_country AS id_level, Country AS nm_level
FROM tempdb..tbl_books INNER JOIN
(tbl_ch2_bespoke_report INNER JOIN tbl_ch2_bespoke_rpt_mapping
ON tbl_ch2_bespoke_report.id_report = tbl_ch2_bespoke_rpt_mapping.id_report)
ON id_" + @TEST + "= tbl_ch2_bespoke_rpt_mapping.id_pnl_level
WHERE tbl_ch2_bespoke_report.id_report = 14")
but gives me multiple results:
1 1 row(s) affected.
id_level nm_level
1 4376 XYZ
2 4340 ABC
I would like to however only obtain the 2nd result.
Do I need to necessarily use dynamic SQL to achieve this?
Many thanks for your help.
--Chapax
If I'm understanding you correctly, you'd like to eliminate the "1 row(s) affected." line. If so, the "set nocount on/off" option should do the trick:
declare @something int
declare @query varchar(2000)
set nocount on
select @something=30
select @query = "SELECT * FROM a_table where id_row = " + convert(varchar(10),@something)
set nocount off
exec (@query)
or
declare @something int
declare @query varchar(2000)
set nocount on
select @something=30
set nocount off
SELECT * FROM a_table where id_row = @something
Use SET NOCOUNT {ON|OFF} to turn row count messages off or back on.
Yes, you need to use dynamic SQL to change the structure or content of the result set (either the column list or the WHERE clause).
