Desired Outcome:
High level introduction to SAS using basic overview from SASCrunch.com with various Lessons Learned:
Accessing data from a file
-Within the code, the first line in tutorial is:
Filename file1 ‘folders/myfolders/file1.csv’
-I was able to easily create the CSV, but within the Mac it was in the myfolders directory, and I didn’t realize that SAS is configured first for ‘folders’
-Within SAS, Code, Server Files and Folders, I first click on select “My Folders”
-Click on the icon above called Folder Properties
-Location: /folders/myfolders
-Note: I was thinking this was just within the tutorial, but this actually is the path
Different formats of files for input
Note: This was an issue because I was getting ahead of the documentation, and didn’t realize that there are different kinds of input file formats (space-delimited, comma-delimited, aligned by columns. I was creating the file in Mac MS-Excel and saving in comma separated values (.CSV), but there are different forms of that within Save-As, and I’m choosing what’s specifically name “comma separated values (CSV)”.
-Because I’m saving as comma delimited, I’ll try:
Data DS;
Infile file1 firstobs=2 dlm=’,’;
Input ID $ Age;
Run;
Issue: Within Output Data, there is only data in first row, ID of: ID0001,2
Note: Properties of the .csv file is that it’s opened within MS-Excel. Maybe the raw .CSV file needs to be saved differently.
Note: Now I’m finding myriad ways of doing this, including right-clicking on the file1.csv (in SAS Studio), selecting “Import Data”, and accepting all the defaults – does everything for me.
Note: I need to dig deeper into differences between file formats, exporting, importing. Will probably be a different blog post.
-This is the code that was auto-generated when I right-clicked on the data file and chose to “Import Data”:
/* Generated Code (IMPORT) */
/* Source File: file1.csv */
/* Source Path: /folders/myfolders */
/* Code generated on: 6/9/19, 9:01 AM */
%web_drop_table(WORK.IMPORT);
FILENAME REFFILE ‘/folders/myfolders/file1.csv’;
PROC IMPORT DATAFILE=REFFILE
DBMS=CSV
OUT=WORK.IMPORT;
GETNAMES=YES;
RUN;
PROC CONTENTS DATA=WORK.IMPORT; RUN;
%web_open_table(WORK.IMPORT);
Note: SAS is using various variables like “DBMS”. This gives me some clues on what’s really required for importing files.
Sorting
Note: This one didn’t quite work as planned. I didn’t see the “flag” after running it. Need to work on this one.
proc sort data=sashelp.class out=class;
by sex height;
run;
data class2;
set class;
by sex height;
if first.sex then flag = “Shortest”;
else if last.sex then flag = “Tallest”;
run;
Statistical Variables
-SAS has a forte, of course, of statistics. Appears easy to run a program to request a bunch of data. The sashelp.cars data set is something that comes with SAS University Edition.
Proc univariate data=sashelp.cars;
Var MSRP;
Run;
-This gives results of mean and standard deviation (stdev)
Proc means data=sashelp.cars;
Var MSRP;
Run;
-This gives results of frequencies, with various tables
Proc Freq data=sashelp.cars;
Table Make * DriveTrain;
Run;
Proc SQL – Retrieving data from dataset and manipulating it
-This is the primary reason I’m working with SAS University Edition – to learn and practice SQL with various relational databases.
-Example of some code is:
Proc sql; (SQL commands to retrieve data from data set and do things like sorting)
select make, model, msrp (focus on specific variables)
from sashelp.cars (location of the dataset)
where make = “Audi” (subsetting within a variable for a specific field)
order by msrp; (sorting the results by a specific field)
Quit;
Proc SQL – Summarizing Data
Proc sql; (Using SQL commands within SAS)
select make, (Select a specific field within dataset)
count(make) as n, (Count up the number of entries by make, and return the count)
mean(msrp) as mean_msrp (find the mean within each grouping of make)
from sashelp.cars (pull data from this dataset)
group by make; (group together by make)
Quit;
Proc SQL – Creating table (data set)
Proc sql; (SQL commands within SQL)
create table bigfish as (will create a table and call it bigfish)
select * (go into dataset and select all fields within the dataset)
from sashelp.fish (specific dataset)
where weight >1000; (only pull data over a certain weight)
Quit;
SAS Macro
-I’m going to hold off on this for the moment. Need to get more comfortable with the manual stuff before automating with scripts.