The Statistics Canada data files used in SPIDA are distributed in SAS format, so SAS is the program used for the workshop. The day-long session "Data Analysis Using SAS for Windows" (June 1 and June 8, 2004) is intended for participants who have no previous experience with SAS, or who are not sure that their current understanding of SAS is adequate for the workshop. The following self-evaluation exercise is intended to help participants decide whether to register for the sessions.
The Longitudinal Job data file (LJ) from the Survey of Labour and Income Dynamics (SLID) is used as an example. The following files containing SAS statements are presented in the order received from Statistics Canada. For the purpose of this exercise, only the first 29 variables for the first 1000 respondents are used.
The files are defined as follows (all in one ZIP file, lj.zip):
‘ljformat.sas’
FORMAT
   PUPID26c  V001_F.
   EYOB26c   V002_F.
   EAGE26c   V003_F.
   ...
   YRXFT11b  V028_F.
   YRXFT11c  V029_F.;
‘ljlayout.sas’
LIBNAME ???;
DATA OUT.???;
   INFILE LJ LRECL = 102;
   INPUT
      @ 1    PUPID26c   7.
      @ 8    ELGW26c   10.4
      @ 18   SSWT26c   10.4
      @ 28   EYOB26c    4.
      ...
      @ 99   YRXFT11b   2.
      @ 101  YRXFT11c   2.;
Good luck ……… Happy SASing!
OUTPUT TABLES
[a] Seven Simple Frequency Tabulations
Note: the table titles indicate whether missing values are included or excluded.
SLID.lj9394 - Part
Include missing values in the frequencies

Sex

                                       Cumulative Cumulative
SEX21             Frequency    Percent  Frequency    Percent
------------------------------------------------------------
Male                    549       54.9        549       54.9
Female                  451       45.1       1000      100.0

Marital status refyr (grp) - 1992

                                               Cumulative Cumulative
MARST26A                  Frequency    Percent  Frequency    Percent
--------------------------------------------------------------------
Don't Know                       11        1.1         11        1.1
Not Applicable                   13        1.3         24        2.4
Married                         530       53.0        554       55.4
Common-law                       80        8.0        634       63.4
Separated                        23        2.3        657       65.7
Divorced                         29        2.9        686       68.6
Widowed                           8        0.8        694       69.4
Single (never married)          306       30.6       1000      100.0

Marital status refyr (grp) - 1994

                                               Cumulative Cumulative
MARST26C                  Frequency    Percent  Frequency    Percent
--------------------------------------------------------------------
Don't Know                        5        0.5          5        0.5
Married                         549       54.9        554       55.4
Common-law                       84        8.4        638       63.8
Separated                        41        4.1        679       67.9
Divorced                         37        3.7        716       71.6
Widowed                           6        0.6        722       72.2
Single (never married)          278       27.8       1000      100.0

Immigrant

                                       Cumulative Cumulative
IMMST15           Frequency    Percent  Frequency    Percent
------------------------------------------------------------
Don't Know               14        1.4         14        1.4
Yes                      68        6.8         82        8.2
No                      918       91.8       1000      100.0

Age grp at immigration

                                       Cumulative Cumulative
AGIMMG15          Frequency    Percent  Frequency    Percent
------------------------------------------------------------
Don't Know               19        1.9         19        1.9
Not Applicable          918       91.8        937       93.7
00-09                    15        1.5        952       95.2
10-19                    16        1.6        968       96.8
20-29                    21        2.1        989       98.9
30-39                     9        0.9        998       99.8
40-49                     1        0.1        999       99.9
50 and older              1        0.1       1000      100.0

Region - 1992

                                         Cumulative Cumulative
REGRE25A            Frequency    Percent  Frequency    Percent
--------------------------------------------------------------
Atlantic                  250       25.0        250       25.0
Quebec                    177       17.7        427       42.7
Ontario                   236       23.6        663       66.3
Prairies                  246       24.6        909       90.9
British Columbia           91        9.1       1000      100.0

Region - 1994

                                         Cumulative Cumulative
REGRE25C            Frequency    Percent  Frequency    Percent
--------------------------------------------------------------
Don't Know                  7        0.7          7        0.7
Atlantic                  235       23.5        242       24.2
Quebec                    178       17.8        420       42.0
Ontario                   239       23.9        659       65.9
Prairies                  241       24.1        900       90.0
British Columbia          100       10.0       1000      100.0
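One way to request frequency listings like those above is PROC FREQ with the MISSING option. This is only a minimal sketch, not the workshop's answer key. It assumes the data set is stored as SLID.lj9394 (as the output title suggests), that the formats from ljformat.sas are already attached, and that the "Don't Know" and "Not Applicable" codes have been stored as SAS missing values. The library path is hypothetical.

```sas
/* Hypothetical path: point the libref at the folder holding the data set */
libname slid 'C:\spida\slid';

title 'Include missing values in the frequencies';

proc freq data=slid.lj9394;
   /* MISSING treats missing values as ordinary levels, so they are
      listed in the table and counted in the percentages */
   tables sex21 marst26a marst26c immst15 agimmg15 regre25a regre25c
          / missing;
run;
```

The exact layout and number of decimals depend on the SAS version and output settings, so a listing produced this way may not match the tables above character for character.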
[b] Three Cross-tabulations (unweighted data)
Note: the table titles indicate whether missing values are included or excluded.
SLID.lj9394 - Part
Exclude missing values in the cross-tabulations

TABLE OF SEX21 BY IMMST15

SEX21(Sex)     IMMST15(Immigrant)

Frequency      |
Percent        |
Row Pct        |
Col Pct        |Yes     |No      |  Total
---------------+--------+--------+
Male           |     39 |    500 |    539
               |   3.96 |  50.71 |  54.67
               |   7.24 |  92.76 |
               |  57.35 |  54.47 |
---------------+--------+--------+
Female         |     29 |    418 |    447
               |   2.94 |  42.39 |  45.33
               |   6.49 |  93.51 |
               |  42.65 |  45.53 |
---------------+--------+--------+
Total                68      918      986
                   6.90    93.10   100.00

Frequency Missing = 14

SLID.lj9394 - Part
Exclude missing values in the cross-tabulations

TABLE OF MARST26A BY MARST26C

MARST26A(Marital status refyr (grp) - 1992)
          MARST26C(Marital status refyr (grp) - 1994)

Frequency        |
Percent          |
Row Pct          |Married |Common-l|Separate|Divorced|Widowed |Single (|  Total
Col Pct          |        |aw      |d       |        |        |never ma|
                 |        |        |        |        |        |rried)  |
-----------------+--------+--------+--------+--------+--------+--------+
Married          |    509 |      4 |     14 |      1 |      0 |      0 |    528
                 |  52.42 |   0.41 |   1.44 |   0.10 |   0.00 |   0.00 |  54.38
                 |  96.40 |   0.76 |   2.65 |   0.19 |   0.00 |   0.00 |
                 |  93.74 |   4.82 |  35.90 |   2.78 |   0.00 |   0.00 |
-----------------+--------+--------+--------+--------+--------+--------+
Common-law       |      9 |     60 |      9 |      2 |      0 |      0 |     80
                 |   0.93 |   6.18 |   0.93 |   0.21 |   0.00 |   0.00 |   8.24
                 |  11.25 |  75.00 |  11.25 |   2.50 |   0.00 |   0.00 |
                 |   1.66 |  72.29 |  23.08 |   5.56 |   0.00 |   0.00 |
-----------------+--------+--------+--------+--------+--------+--------+
Separated        |      1 |      0 |     16 |      5 |      0 |      0 |     22
                 |   0.10 |   0.00 |   1.65 |   0.51 |   0.00 |   0.00 |   2.27
                 |   4.55 |   0.00 |  72.73 |  22.73 |   0.00 |   0.00 |
                 |   0.18 |   0.00 |  41.03 |  13.89 |   0.00 |   0.00 |
-----------------+--------+--------+--------+--------+--------+--------+
Divorced         |      1 |      2 |      0 |     26 |      0 |      0 |     29
                 |   0.10 |   0.21 |   0.00 |   2.68 |   0.00 |   0.00 |   2.99
                 |   3.45 |   6.90 |   0.00 |  89.66 |   0.00 |   0.00 |
                 |   0.18 |   2.41 |   0.00 |  72.22 |   0.00 |   0.00 |
-----------------+--------+--------+--------+--------+--------+--------+
Widowed          |      1 |      0 |      0 |      0 |      6 |      0 |      7
                 |   0.10 |   0.00 |   0.00 |   0.00 |   0.62 |   0.00 |   0.72
                 |  14.29 |   0.00 |   0.00 |   0.00 |  85.71 |   0.00 |
                 |   0.18 |   0.00 |   0.00 |   0.00 | 100.00 |   0.00 |
-----------------+--------+--------+--------+--------+--------+--------+
Single (never ma |     22 |     17 |      0 |      2 |      0 |    264 |    305
rried)           |   2.27 |   1.75 |   0.00 |   0.21 |   0.00 |  27.19 |  31.41
                 |   7.21 |   5.57 |   0.00 |   0.66 |   0.00 |  86.56 |
                 |   4.05 |  20.48 |   0.00 |   5.56 |   0.00 | 100.00 |
-----------------+--------+--------+--------+--------+--------+--------+
Total                 543       83       39       36        6      264      971
                    55.92     8.55     4.02     3.71     0.62    27.19   100.00

Frequency Missing = 29

Exclude missing values in the cross-tabulations

TABLE OF REGRE25A BY REGRE25C

REGRE25A(Region - 1992)     REGRE25C(Region - 1994)

Frequency        |
Percent          |
Row Pct          |
Col Pct          |Atlantic|Quebec  |Ontario |Prairies|British |  Total
                 |        |        |        |        |Columbia|
-----------------+--------+--------+--------+--------+--------+
Atlantic         |    235 |      2 |      3 |      3 |      3 |    246
                 |  23.67 |   0.20 |   0.30 |   0.30 |   0.30 |  24.77
                 |  95.53 |   0.81 |   1.22 |   1.22 |   1.22 |
                 | 100.00 |   1.12 |   1.26 |   1.24 |   3.00 |
-----------------+--------+--------+--------+--------+--------+
Quebec           |      0 |    176 |      0 |      0 |      1 |    177
                 |   0.00 |  17.72 |   0.00 |   0.00 |   0.10 |  17.82
                 |   0.00 |  99.44 |   0.00 |   0.00 |   0.56 |
                 |   0.00 |  98.88 |   0.00 |   0.00 |   1.00 |
-----------------+--------+--------+--------+--------+--------+
Ontario          |      0 |      0 |    232 |      2 |      2 |    236
                 |   0.00 |   0.00 |  23.36 |   0.20 |   0.20 |  23.77
                 |   0.00 |   0.00 |  98.31 |   0.85 |   0.85 |
                 |   0.00 |   0.00 |  97.07 |   0.83 |   2.00 |
-----------------+--------+--------+--------+--------+--------+
Prairies         |      0 |      0 |      4 |    236 |      3 |    243
                 |   0.00 |   0.00 |   0.40 |  23.77 |   0.30 |  24.47
                 |   0.00 |   0.00 |   1.65 |  97.12 |   1.23 |
                 |   0.00 |   0.00 |   1.67 |  97.93 |   3.00 |
-----------------+--------+--------+--------+--------+--------+
British Columbia |      0 |      0 |      0 |      0 |     91 |     91
                 |   0.00 |   0.00 |   0.00 |   0.00 |   9.16 |   9.16
                 |   0.00 |   0.00 |   0.00 |   0.00 | 100.00 |
                 |   0.00 |   0.00 |   0.00 |   0.00 |  91.00 |
-----------------+--------+--------+--------+--------+--------+
Total                 235      178      239      241      100      993
                    23.67    17.93    24.07    24.27    10.07   100.00

Frequency Missing = 7
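Cross-tabulations of this kind can be requested by joining two variables with an asterisk in the TABLES statement. The sketch below is one possible approach under the same assumptions as the tables above: the data set is SLID.lj9394, the formats are attached, and "Don't Know" and "Not Applicable" are stored as SAS missing values. The library path is hypothetical.

```sas
/* Hypothetical path: point the libref at the folder holding the data set */
libname slid 'C:\spida\slid';

title 'Exclude missing values in the cross-tabulations';

proc freq data=slid.lj9394;
   /* Without the MISSING option, a respondent with a missing value on
      either variable is left out of the table and counted only in the
      "Frequency Missing" line */
   tables sex21*immst15
          marst26a*marst26c
          regre25a*regre25c;
run;
```

Each cell holds four numbers. For the Male/Yes cell of SEX21 BY IMMST15, for example, the frequency is 39, the percent is 39/986 = 3.96, the row percent is 39/539 = 7.24, and the column percent is 39/68 = 57.35. The 14 excluded respondents are the 14 "Don't Know" cases in the IMMST15 frequency table.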
[c] Two Cross-tabulations (weighted data)
Note: the tables SEX21 BY IMMST15 and REGRE25A BY REGRE25C are presented in weighted form as well (using the ELGW26c variable as the weight).
SLID.lj9394 - Part
Exclude missing values in the cross-tabulations
WEIGHT BY ELGW26C - longitudinal weight

TABLE OF SEX21 BY IMMST15

SEX21(Sex)     IMMST15(Immigrant)

Frequency      |
Percent        |
Row Pct        |
Col Pct        |Yes     |No      |  Total
---------------+--------+--------+
Male           |  44755 | 395099 | 439854
               |   5.87 |  51.84 |  57.71
               |  10.18 |  89.82 |
               |  57.48 |  57.73 |
---------------+--------+--------+
Female         |  33105 | 289259 | 322364
               |   4.34 |  37.95 |  42.29
               |  10.27 |  89.73 |
               |  42.52 |  42.27 |
---------------+--------+--------+
Total           77860.2   684358   762218
                  10.21    89.79   100.00

Frequency Missing = 9266.3419

SLID.lj9394 - Part
Exclude missing values in the cross-tabulations
WEIGHT BY ELGW26C - longitudinal weight

TABLE OF REGRE25A BY REGRE25C

REGRE25A(Region - 1992)     REGRE25C(Region - 1994)

Frequency        |
Percent          |
Row Pct          |
Col Pct          |Atlantic|Quebec  |Ontario |Prairies|British |  Total
                 |        |        |        |        |Columbia|
-----------------+--------+--------+--------+--------+--------+
Atlantic         |  76634 |  601.5 | 429.28 | 2020.9 | 1254.4 |  80940
                 |   9.97 |   0.08 |   0.06 |   0.26 |   0.16 |  10.53
                 |  94.68 |   0.74 |   0.53 |   2.50 |   1.55 |
                 | 100.00 |   0.38 |   0.15 |   1.45 |   1.09 |
-----------------+--------+--------+--------+--------+--------+
Quebec           |      0 | 159612 |      0 |      0 | 812.81 | 160424
                 |   0.00 |  20.76 |   0.00 |   0.00 |   0.11 |  20.87
                 |   0.00 |  99.49 |   0.00 |   0.00 |   0.51 |
                 |   0.00 |  99.62 |   0.00 |   0.00 |   0.71 |
-----------------+--------+--------+--------+--------+--------+
Ontario          |      0 |      0 | 274094 | 1735.7 | 1839.8 | 277670
                 |   0.00 |   0.00 |  35.65 |   0.23 |   0.24 |  36.12
                 |   0.00 |   0.00 |  98.71 |   0.63 |   0.66 |
                 |   0.00 |   0.00 |  98.87 |   1.24 |   1.60 |
-----------------+--------+--------+--------+--------+--------+
Prairies         |      0 |      0 | 2704.4 | 135686 | 3384.2 | 141775
                 |   0.00 |   0.00 |   0.35 |  17.65 |   0.44 |  18.44
                 |   0.00 |   0.00 |   1.91 |  95.71 |   2.39 |
                 |   0.00 |   0.00 |   0.98 |  97.31 |   2.94 |
-----------------+--------+--------+--------+--------+--------+
British Columbia |      0 |      0 |      0 |      0 | 107931 | 107931
                 |   0.00 |   0.00 |   0.00 |   0.00 |  14.04 |  14.04
                 |   0.00 |   0.00 |   0.00 |   0.00 | 100.00 |
                 |   0.00 |   0.00 |   0.00 |   0.00 |  93.67 |
-----------------+--------+--------+--------+--------+--------+
Total             76634.1   160213   277228   139443   115222   768740
                     9.97    20.84    36.06    18.14    14.99   100.00

Frequency Missing = 2743.9987
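Weighted tables like these can be obtained by adding a WEIGHT statement to PROC FREQ. The following is a sketch under the same assumptions as before (data set SLID.lj9394, formats attached, missing categories stored as SAS missing values). The library path is hypothetical.

```sas
/* Hypothetical path: point the libref at the folder holding the data set */
libname slid 'C:\spida\slid';

title1 'Exclude missing values in the cross-tabulations';
title2 'WEIGHT BY ELGW26C - longitudinal weight';

proc freq data=slid.lj9394;
   /* Each respondent contributes the value of ELGW26c instead of 1,
      so cell "frequencies" are sums of weights */
   weight elgw26c;
   tables sex21*immst15
          regre25a*regre25c;
run;
```

Because the cells are sums of weights, they need not be whole numbers (601.5 and 429.28 in the region table, for instance), and the "Frequency Missing" line is likewise a sum of weights. In both tables the table total plus the missing total comes to about 771,484, the weighted total for the same 1000 respondents.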
[d] Summary Descriptive Statistics (unweighted data)
Note: if the means, standard deviations and maximum values in your average income table do not agree with the values given below, at least one of the variables must have an incorrectly specified missing value!
SLID.lj9394 - Part
Means for income

Variable Label                            N Nmiss  Minimum    Maximum      Mean  Std Dev
----------------------------------------------------------------------------------------
TTINC27B EF-Total money income - 1993   989    11   260.00  220873.00  53947.14 31442.81
TTINC27C EF-Total money income - 1994   984    16   170.00  245190.00  53179.54 31510.56
ATINC27B EF-After-tax income - 1993     989    11   260.00  190866.00  43739.22 23519.87
ATINC27C EF-After-tax income - 1994     984    16   170.00  198336.00  43073.34 23479.36
----------------------------------------------------------------------------------------
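The note about missing value specifications can be illustrated with a short sketch. A reserved numeric code left in an income variable is treated as a real (and very large) income, which inflates the maximum, mean and standard deviation. The code below recodes such a value to a SAS missing value before running PROC MEANS. The value 9999999, the output data set name WORK.lj_clean and the library path are all hypothetical; the actual reserved codes must be taken from the SLID documentation.

```sas
/* Hypothetical path: point the libref at the folder holding the data set */
libname slid 'C:\spida\slid';

data work.lj_clean;               /* hypothetical output data set name */
   set slid.lj9394;
   array inc{*} ttinc27b ttinc27c atinc27b atinc27c;
   do i = 1 to dim(inc);
      /* HYPOTHETICAL reserved code: replace 9999999 with the code(s)
         the SLID documentation gives for these variables */
      if inc{i} = 9999999 then inc{i} = .;
   end;
   drop i;
run;

title 'Means for income';

proc means data=work.lj_clean n nmiss min max mean std maxdec=2;
   var ttinc27b ttinc27c atinc27b atinc27c;
run;
```

A quick check is to compare N and Nmiss with the table above: if Nmiss is 0 for a variable that should have 11 or 16 missing cases, its reserved codes have not been converted.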