SAS is a statistical software package that was developed at North Carolina State University in the 1960s, and since 1976 has been a commercial product sold by SAS Institute. It is another step up in complexity from SPSS: SAS is somewhat more difficult to use but offers much more in terms of types of analyses available and flexibility in specifying and executing those analyses. The major disadvantage for beginners is that SAS is a syntax-based system and there are so many choices to be made for even a simple analysis that it can overwhelm people who don't have a background in or aptitude for programming. SAS is also less friendly in terms of managing data files and metadata; for instance, it stores formats in files separate from the data file and requires that the format location be specified in the syntax every time the data file is opened (rather than attaching the format information to the data file, as SPSS does). However, SAS has become the standard language in many professional fields, and there is more assistance for learning and using SAS, both from the SAS web page (http://www.sas.com) and help desk and from many published books and web sites, than is available for SPSS.
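As a rough illustration of the point about formats, the sketch below stores a format in its own library and then tells SAS where to look for it. The library name myfmts, the folder path, the format name sexfmt, and the value labels are all hypothetical, chosen only for this example.

```sas
/* Hypothetical library and path holding the format catalog. */
libname myfmts 'C:\myproject\formats\';

/* Store a user-defined format in that library, not in the data file. */
proc format library=myfmts;
    value sexfmt 1 = 'Male'
                 2 = 'Female';   /* illustrative labels */
run;

/* In each new session, point SAS at the format location again. */
options fmtsearch=(myfmts work);
```

The libname and options statements would need to be resubmitted in every session that uses a data set relying on sexfmt.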
SAS is similar to SPSS in many ways: it is a comprehensive statistical package that can conduct more types of analyses than can possibly be enumerated here; it can read and write data sets in many different formats; and it is prohibitively expensive for an individual to buy but may be affordable if your school or place of business has a site license. The major difference is that SAS is primarily a syntax-based system, with the exception of JMP, a menu-driven, interactive statistics and graphics package that is now sold by a division of SAS. Many statisticians prefer to work with syntax anyway, partly because they (like me!) are so old they learned to use computers before graphical interfaces were available and partly because (as mentioned in the SPSS section above) syntax may be shared and reused. In addition, writing syntax forces programmers to think through their analysis in a way that can be avoided while clicking on menus. To someone just starting out in statistics, however, the lack of a menu interface may seem more of a barrier than an advantage. This may be somewhat ameliorated by using the time-tested method of taking someone else's code and altering it to fit your needs, and there is so much annotated SAS code available on the Internet that you could probably teach yourself to write SAS programs just by using this method.
SAS has three main windows: the syntax window, where you can type your syntax or copy it in from another text or word processing program; the log window, which contains a record or log of everything done in a particular session, including warnings and other messages from the SAS system; and the output window, where output from statistical procedures is sent by default (it can be directed to other locations, such as an HTML or *.rtf file, through use of the ODS system). To use SAS, you must open an SAS data set or import another type of data (such as a file stored in Excel or text format), submit commands through the syntax window, and check the output in the output window. The log and syntax windows are illustrated in Figure B-7.
Figure B-7. SAS log and syntax windows
![]()
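As a minimal sketch of redirecting output through ODS, mentioned above, the following sends the results of one procedure to an RTF file. The output path is hypothetical; sashelp.class is a small sample data set that ships with SAS and is used here only so the example does not depend on any other file.

```sas
/* Open the RTF destination; the file path is hypothetical. */
ods rtf file='C:\temp\class_freq.rtf';

/* Any procedure output produced now is also written to the RTF file. */
proc freq data=sashelp.class;
    tables sex;
run;

/* Closing the destination finishes writing the file. */
ods rtf close;
```

Replacing rtf with html in both ODS statements (and changing the file extension) would write an HTML file instead.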
The syntax window (transprt) illustrates three main features of SAS programming. The first is that the location of SAS data files is declared using the libname command and the data files themselves are referenced with a two-part name: library.datasetname. In this case we declared the library brfss (the actual name is arbitrary) to exist at the physical location:
C:\Documents and Settings\seb5632\Desktop\2007brfss\
then referenced the data set brfss.brfss2007 that is stored in that location. The second main point is that SAS programs consist primarily of two types of steps:
- DATA steps, which open, manipulate, and save data files.
- PROC steps, which perform statistical analyses on the files.
In this case, our DATA step opened the file brfss.brfss2007, selected cases for Missouri (state code = 29), and stored the selected cases in a new file called brfss.mo. We then created a crosstabulation table for the variables hlthplan and sex using this new data file, selecting only cases with a value of 1 or 2 on the variable hlthplan.
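A program along the lines just described might look like the sketch below. It is a reconstruction from the description, not a copy of the syntax shown in the figure; in particular, the name of the state code variable (_state) is an assumption.

```sas
/* Declare the library at the physical location given above. */
libname brfss 'C:\Documents and Settings\seb5632\Desktop\2007brfss\';

/* DATA step: keep only Missouri cases and save them as brfss.mo. */
data brfss.mo;
    set brfss.brfss2007;
    where _state = 29;   /* _state is an assumed variable name */
run;

/* PROC step: crosstabulate hlthplan by sex, */
/* using only cases where hlthplan is 1 or 2. */
proc freq data=brfss.mo;
    where hlthplan in (1, 2);
    tables hlthplan*sex;
run;
```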
The log window (Log – (Untitled)) echoes the syntax submitted and also contains messages from the SAS system. Messages in the log tell us that 5,252 cases were used to create the crosstabulation table, and give us information about processing time and CPU usage.
An excerpt from the output created by this syntax is shown in the SAS output window in Figure B-8.
To the far left of the log, syntax, and output windows are two other windows that may be toggled between by use of the tabs in their lower corners. The Results window (Figure B-9) shows an outline of the results produced during a session: clicking on any folder causes the next greater level of detail to be displayed. The Explorer window (Figures B-10 and B-11) allows access to different SAS libraries (any libraries created by the user, such as y in this case, must have been declared by a libname command during the current SAS session): clicking on the folders moves the display to the next greater level of detail.
Figure B-9. SAS Results window: output from the Frequency procedure is displayed at one greater level of detail than output from the Means or Corr procedures
![]()
Figure B-10. SAS Explorer window
![]()
Figure B-11. Contents of a data library (three SAS data files) from SAS Explorer window
![]()
Note that it is possible to open an SAS data set in spreadsheet form (as in Figure B-12), which SAS calls Viewtable format, by clicking on it in the Explorer window, and that it is possible to enter or edit data directly by this method: normally, however, in SAS these procedures are accomplished using syntax.
Figure B-12. SAS data set in Viewtable format
![]()
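As a small illustration of entering and editing data with syntax rather than through Viewtable, the sketch below creates a tiny data set and then changes one value. The data set name, variable names, and values are invented for the example.

```sas
/* Enter data inline; all names and values are hypothetical. */
data work.scores;
    input id name $ score;
    datalines;
1 Ana 82
2 Ben 75
3 Cho 91
;
run;

/* Edit one value using a DATA step instead of the spreadsheet view. */
data work.scores;
    set work.scores;
    if id = 2 then score = 77;
run;

/* List the data set to confirm the change. */
proc print data=work.scores;
run;
```

Because the edit is written as syntax, it leaves a record in the log and can be rerun or shared, which is not true of a change typed directly into the Viewtable window.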
Several resources to learn SAS are listed in Appendix C. This is only a beginning: there are many resources available on the Internet to help you learn SAS. Because it is a syntax-based language, examples of annotated code are particularly useful in learning SAS. Good sources of annotated code include the support section of the SAS web site (http://support.sas.com), the UCLA web site (http://www.ats.ucla.edu/stat/sas/modules/), and the Texas A&M web site (http://techdocs.tamu.edu/Completed/SASUG); many more can be found by searching the Internet. There are many more books published about SAS than about any of the other packages discussed in this appendix, so the trick is not finding a book about SAS but finding the book that meets your needs. Two good books for beginners that include discussions about file types, importing and exporting data, etc., are Delwiche and Slaughter's The Little SAS Book: A Primer (SAS), and Cody's Learning SAS by Example: A Programmer's Guide (SAS). Cody and Smith's Applied Statistics and the SAS Programming Language (Prentice Hall) focuses more on statistical procedures. There are also many discipline-specific SAS books, such as Walker's Common Statistical Methods for Clinical Research with SAS Examples (SAS).