Should I register for the SAS course?

The Statistics Canada data files used in SPIDA are presented in SAS format. Accordingly, we will use SAS for the workshop. The day-long session "Data Analysis Using SAS for Windows", June 1 and June 8, 2004, is intended for participants with no previous experience with SAS, or for those who are not sure that their current understanding of SAS is adequate for the workshop. The following self-evaluation exercise is intended to help participants decide whether or not to register.

The Longitudinal Job data file (LJ) from the Survey of Labour and Income Dynamics (SLID) is used as an example. The following files containing SAS statements are presented in the order received from Statistics Canada. For the purpose of this exercise, only the first 29 variables for the first 1000 respondents are used.

The files are as follows (all in one ZIP file, lj.zip):

  1. ljformat.sas
  2. ljlayout.sas
  3. ljmiss.sas
  4. ljvalue.sas
  5. ljvar.sas
  6. lj9394.dta
  1. The first step is to DOWNLOAD these files. The first few … and the last records in ljformat.sas and in ljlayout.sas should be as follows:

  2. ljformat.sas:

    FORMAT
       PUPID26c V001_F.
       EYOB26c V002_F.
       EAGE26c V003_F.
       ...
       YRXFT11b V028_F.
       YRXFT11c V029_F.;
    

    ljlayout.sas:

    LIBNAME ???;
    DATA OUT.???;
       INFILE LJ LRECL = 102;
       INPUT
       @ 1 PUPID26c 7.
       @ 8 ELGW26c 10.4
       @ 18 SSWT26c 10.4
       @ 28 EYOB26c 4.
       ...
       @ 99 YRXFT11b 2.
       @ 101 YRXFT11c 2.;
    
  3. Using these files, set up and execute a SAS program. [If you did not know that SAS requires the statements of a program to appear in a specific order, we shall see you in the course!]
  4. Write the PROC statements (e.g., PROC FREQ, PROC MEANS, etc.) in order to reproduce the tables in the following section.
    1. Frequencies for the variables: sex (SEX21), marital status in 1992 (MARST26A), marital status in 1994 (MARST26C), immigrant status (IMMST15), and so on for other variables, as shown in part [a]
    2. Cross-tabulations of pairs of variables: e.g., sex (SEX21) by immigrant status (IMMST15) as shown in part [b]
    3. Cross-tabulations using weighted data, as shown in part [c]
    4. Summary statistics (N, Nmiss, Minimum, Maximum, Mean, Std Dev) for some of the variables related to income in 1993 and 1994, as shown in part [d]
  5. If your output tables replicate about 90% of what follows, you need not attend the SAS session, but you are welcome to attend to brush up or seek extra help.
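
As a hint for step 3, the ordering that SAS expects can be sketched on a toy example. Everything in this sketch is invented for illustration (the dataset WORK.TOY, the format SEXF., the codes 1/2/9 and the three records); none of it comes from the SLID files. In the real program the INFILE and INPUT statements would come from ljlayout.sas and the FORMAT statement from ljformat.sas. The value definitions, missing-value recodes and labels would presumably come from ljvalue.sas, ljmiss.sas and ljvar.sas, but that mapping is only inferred from the file names.

```sas
/* Toy illustration of statement order. The records, the format SEXF.
   and the codes 1/2/9 are hypothetical, not taken from SLID. */

/* 1. Define value formats before any step that refers to them */
PROC FORMAT;
  VALUE SEXF 1 = 'Male'
             2 = 'Female';
RUN;

/* 2. DATA step: read the raw records, recode missing values,
      then attach labels and formats */
DATA WORK.TOY;
  INPUT @1 ID 3. @4 SEX 1.;
  IF SEX = 9 THEN SEX = .;     /* hypothetical "Don't Know" code */
  LABEL SEX = 'Sex';
  FORMAT SEX SEXF.;
  DATALINES;
0011
0022
0039
;
RUN;

/* 3. Procedures come last, once the dataset exists */
PROC FREQ DATA = WORK.TOY;
  TABLES SEX / MISSING;
RUN;
```

The point of the sketch is only the sequence: formats, then the DATA step, then the procedures. A FORMAT statement that names a format that has not yet been defined will fail, which is why the listing order of the downloaded files cannot be used as the program order.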

Good luck ……… Happy SASing!

OUTPUT TABLES

[a] Seven Simple Frequency Tabulations

Note: the table titles indicate whether missing values are included or excluded.

SLID.lj9394 - Part
                Include missing values in the frequencies
                             Sex
                                      Cumulative  Cumulative
         SEX21   Frequency   Percent   Frequency    Percent
------------------------------------------------------------
Male                  549      54.9         549       54.9
Female                451      45.1        1000      100.0


                  Marital status refyr (grp) - 1992

                                              Cumulative  Cumulative
              MARST26A   Frequency   Percent   Frequency    Percent
--------------------------------------------------------------------
Don't Know                     11       1.1          11        1.1
Not Applicable                 13       1.3          24        2.4
Married                       530      53.0         554       55.4
Common-law                     80       8.0         634       63.4
Separated                      23       2.3         657       65.7
Divorced                       29       2.9         686       68.6
Widowed                         8       0.8         694       69.4
Single (never married)        306      30.6        1000      100.0


                       Marital status refyr (grp) - 1994

                                              Cumulative  Cumulative
              MARST26C   Frequency   Percent   Frequency    Percent
--------------------------------------------------------------------
Don't Know                      5       0.5           5        0.5
Married                       549      54.9         554       55.4
Common-law                     84       8.4         638       63.8
Separated                      41       4.1         679       67.9
Divorced                       37       3.7         716       71.6
Widowed                         6       0.6         722       72.2
Single (never married)        278      27.8        1000      100.0


                          Immigrant

                                      Cumulative  Cumulative
       IMMST15   Frequency   Percent   Frequency    Percent
------------------------------------------------------------
Don't Know             14       1.4          14        1.4
Yes                    68       6.8          82        8.2
No                    918      91.8        1000      100.0

                 

                      Age grp at immigration

                                      Cumulative  Cumulative
      AGIMMG15   Frequency   Percent   Frequency    Percent
------------------------------------------------------------
Don't Know             19       1.9          19        1.9
Not Applicable        918      91.8         937       93.7
00-09                  15       1.5         952       95.2
10-19                  16       1.6         968       96.8
20-29                  21       2.1         989       98.9
30-39                   9       0.9         998       99.8
40-49                   1       0.1         999       99.9
50 and older            1       0.1        1000      100.0

                         Region - 1992
                                        Cumulative  Cumulative
        REGRE25A   Frequency   Percent   Frequency    Percent
--------------------------------------------------------------
Atlantic                250      25.0         250       25.0
Quebec                  177      17.7         427       42.7
Ontario                 236      23.6         663       66.3
Prairies                246      24.6         909       90.9
British Columbia         91       9.1        1000      100.0

                         Region - 1994 
                                        Cumulative  Cumulative
        REGRE25C   Frequency   Percent   Frequency    Percent
--------------------------------------------------------------
Don't Know                7       0.7           7        0.7
Atlantic                235      23.5         242       24.2
Quebec                  178      17.8         420       42.0
Ontario                 239      23.9         659       65.9
Prairies                241      24.1         900       90.0
British Columbia        100      10.0        1000      100.0
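
A minimal sketch of how the seven tables in part [a] might be requested. It assumes that the dataset from step 3 was saved as SLID.lj9394 (a name taken from the table titles), that the libref SLID is already assigned, and that labels and value formats are attached; the TITLE text simply copies the headings shown above.

```sas
/* Assumes SLID.lj9394 exists with labels and formats attached */
TITLE1 'SLID.lj9394 - Part';
TITLE2 'Include missing values in the frequencies';

PROC FREQ DATA = SLID.lj9394;
  /* MISSING makes PROC FREQ treat missing values as ordinary levels,
     so they are counted in the frequencies and the percentages */
  TABLES SEX21 MARST26A MARST26C IMMST15 AGIMMG15 REGRE25A REGRE25C
         / MISSING;
RUN;
```

Without the MISSING option, PROC FREQ leaves missing values out of the table body and reports them in a "Frequency Missing" line instead.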

[b] Three Cross-tabulations (unweighted data)

Note: the table titles indicate whether missing values are included or excluded.

SLID.lj9394 - Part   Exclude missing values in the cross-tabulations

TABLE OF SEX21 BY IMMST15
SEX21(Sex)      IMMST15(Immigrant)

Frequency      |
Percent        |
Row Pct        |
Col Pct        |Yes     |No      |  Total
               |        |        |
---------------+--------+--------+
Male           |     39 |    500 |    539
               |   3.96 |  50.71 |  54.67
               |   7.24 |  92.76 |
               |  57.35 |  54.47 |
---------------+--------+--------+
Female         |     29 |    418 |    447
               |   2.94 |  42.39 |  45.33
               |   6.49 |  93.51 |
               |  42.65 |  45.53 |
---------------+--------+--------+
Total                68      918      986
                   6.90    93.10   100.00

Frequency Missing = 14


SLID.lj9394 - Part      Exclude missing values in the cross-tabulations
TABLE OF MARST26A BY MARST26C
MARST26A(Marital status refyr (grp) - 1992)     MARST26C(Marital status refyr (grp) - 1994)

 
Frequency        |
Percent          |
Row Pct          |Married |Common-l|Separate|Divorced|Widowed |Single (|  Total
Col Pct          |        |aw      |d       |        |        |never ma|
                 |        |        |        |        |        |rried)  |
-----------------+--------+--------+--------+--------+--------+--------+
Married          |    509 |      4 |     14 |      1 |      0 |      0 |    528
                 |  52.42 |   0.41 |   1.44 |   0.10 |   0.00 |   0.00 |  54.38
                 |  96.40 |   0.76 |   2.65 |   0.19 |   0.00 |   0.00 |
                 |  93.74 |   4.82 |  35.90 |   2.78 |   0.00 |   0.00 |
-----------------+--------+--------+--------+--------+--------+--------+
Common-law       |      9 |     60 |      9 |      2 |      0 |      0 |     80
                 |   0.93 |   6.18 |   0.93 |   0.21 |   0.00 |   0.00 |   8.24
                 |  11.25 |  75.00 |  11.25 |   2.50 |   0.00 |   0.00 |
                 |   1.66 |  72.29 |  23.08 |   5.56 |   0.00 |   0.00 |
-----------------+--------+--------+--------+--------+--------+--------+
Separated        |      1 |      0 |     16 |      5 |      0 |      0 |     22
                 |   0.10 |   0.00 |   1.65 |   0.51 |   0.00 |   0.00 |   2.27
                 |   4.55 |   0.00 |  72.73 |  22.73 |   0.00 |   0.00 |
                 |   0.18 |   0.00 |  41.03 |  13.89 |   0.00 |   0.00 |
-----------------+--------+--------+--------+--------+--------+--------+
Divorced         |      1 |      2 |      0 |     26 |      0 |      0 |     29
                 |   0.10 |   0.21 |   0.00 |   2.68 |   0.00 |   0.00 |   2.99
                 |   3.45 |   6.90 |   0.00 |  89.66 |   0.00 |   0.00 |
                 |   0.18 |   2.41 |   0.00 |  72.22 |   0.00 |   0.00 |
-----------------+--------+--------+--------+--------+--------+--------+
Widowed          |      1 |      0 |      0 |      0 |      6 |      0 |      7
                 |   0.10 |   0.00 |   0.00 |   0.00 |   0.62 |   0.00 |   0.72
                 |  14.29 |   0.00 |   0.00 |   0.00 |  85.71 |   0.00 |
                 |   0.18 |   0.00 |   0.00 |   0.00 | 100.00 |   0.00 |
-----------------+--------+--------+--------+--------+--------+--------+
Single (never ma |     22 |     17 |      0 |      2 |      0 |    264 |    305
rried)           |   2.27 |   1.75 |   0.00 |   0.21 |   0.00 |  27.19 |  31.41
                 |   7.21 |   5.57 |   0.00 |   0.66 |   0.00 |  86.56 |
                 |   4.05 |  20.48 |   0.00 |   5.56 |   0.00 | 100.00 |
-----------------+--------+--------+--------+--------+--------+--------+
Total                 543       83       39       36        6      264      971
                    55.92     8.55     4.02     3.71     0.62    27.19   100.00

Frequency Missing = 29

                        Exclude missing values in the cross-tabulations

TABLE OF REGRE25A BY REGRE25C

REGRE25A(Region - 1992)     REGRE25C(Region - 1994)

Frequency        |
Percent          |
Row Pct          |
Col Pct          |Atlantic|Quebec  |Ontario |Prairies|British |  Total
                 |        |        |        |        |Columbia|
-----------------+--------+--------+--------+--------+--------+
Atlantic         |    235 |      2 |      3 |      3 |      3 |    246
                 |  23.67 |   0.20 |   0.30 |   0.30 |   0.30 |  24.77
                 |  95.53 |   0.81 |   1.22 |   1.22 |   1.22 |
                 | 100.00 |   1.12 |   1.26 |   1.24 |   3.00 |
-----------------+--------+--------+--------+--------+--------+
Quebec           |      0 |    176 |      0 |      0 |      1 |    177
                 |   0.00 |  17.72 |   0.00 |   0.00 |   0.10 |  17.82
                 |   0.00 |  99.44 |   0.00 |   0.00 |   0.56 |
                 |   0.00 |  98.88 |   0.00 |   0.00 |   1.00 |
-----------------+--------+--------+--------+--------+--------+
Ontario          |      0 |      0 |    232 |      2 |      2 |    236
                 |   0.00 |   0.00 |  23.36 |   0.20 |   0.20 |  23.77
                 |   0.00 |   0.00 |  98.31 |   0.85 |   0.85 |
                 |   0.00 |   0.00 |  97.07 |   0.83 |   2.00 |
-----------------+--------+--------+--------+--------+--------+
Prairies         |      0 |      0 |      4 |    236 |      3 |    243
                 |   0.00 |   0.00 |   0.40 |  23.77 |   0.30 |  24.47
                 |   0.00 |   0.00 |   1.65 |  97.12 |   1.23 |
                 |   0.00 |   0.00 |   1.67 |  97.93 |   3.00 |
-----------------+--------+--------+--------+--------+--------+
British Columbia |      0 |      0 |      0 |      0 |     91 |     91
                 |   0.00 |   0.00 |   0.00 |   0.00 |   9.16 |   9.16
                 |   0.00 |   0.00 |   0.00 |   0.00 | 100.00 |
                 |   0.00 |   0.00 |   0.00 |   0.00 |  91.00 |
-----------------+--------+--------+--------+--------+--------+
Total                 235      178      239      241      100      993
                    23.67    17.93    24.07    24.27    10.07   100.00

Frequency Missing = 7
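
The three cross-tabulations in part [b] could be requested along the following lines. This is a sketch under the same assumptions as before: the dataset name SLID.lj9394 is taken from the table titles, and the missing codes are assumed to have been recoded to SAS missing values in the DATA step.

```sas
/* Assumes SLID.lj9394 exists and missing codes are SAS missing values */
TITLE1 'SLID.lj9394 - Part';
TITLE2 'Exclude missing values in the cross-tabulations';

PROC FREQ DATA = SLID.lj9394;
  /* row*column requests a two-way table; with no MISSING option,
     missing values are dropped and counted as "Frequency Missing" */
  TABLES SEX21*IMMST15
         MARST26A*MARST26C
         REGRE25A*REGRE25C;
RUN;
```

By default each cell shows Frequency, Percent, Row Pct and Col Pct, which matches the layout of the tables above.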

[c] Two (2) Cross-tabulations using weighted data

Note: the tables SEX21 BY IMMST15 and REGRE25A BY REGRE25C are presented in weighted form as well (using the ELGW26c variable as the weight).

SLID.lj9394 - Part           Exclude missing values in the cross-tabulations
WEIGHT BY ELGW26C - longitudinal weight
TABLE OF SEX21 BY IMMST15

SEX21(Sex)      IMMST15(Immigrant)

Frequency      |
Percent        |
Row Pct        |
Col Pct        |Yes     |No      |  Total
               |        |        |
---------------+--------+--------+
Male           |  44755 | 395099 | 439854
               |   5.87 |  51.84 |  57.71
               |  10.18 |  89.82 |
               |  57.48 |  57.73 |
---------------+--------+--------+
Female         |  33105 | 289259 | 322364
               |   4.34 |  37.95 |  42.29
               |  10.27 |  89.73 |
               |  42.52 |  42.27 |
---------------+--------+--------+
Total           77860.2   684358   762218
                  10.21    89.79   100.00

Frequency Missing = 9266.3419

 
SLID.lj9394 - Part Exclude missing values in the cross-tabulations
WEIGHT BY ELGW26C - longitudinal weight

 
TABLE OF REGRE25A BY REGRE25C
REGRE25A(Region - 1992)     REGRE25C(Region - 1994)
Frequency        |
Percent          |
Row Pct          |
Col Pct          |Atlantic|Quebec  |Ontario |Prairies|British |  Total
                 |        |        |        |        |Columbia|
-----------------+--------+--------+--------+--------+--------+
Atlantic         |  76634 |  601.5 | 429.28 | 2020.9 | 1254.4 |  80940
                 |   9.97 |   0.08 |   0.06 |   0.26 |   0.16 |  10.53
                 |  94.68 |   0.74 |   0.53 |   2.50 |   1.55 |
                 | 100.00 |   0.38 |   0.15 |   1.45 |   1.09 |
-----------------+--------+--------+--------+--------+--------+
Quebec           |      0 | 159612 |      0 |      0 | 812.81 | 160424
                 |   0.00 |  20.76 |   0.00 |   0.00 |   0.11 |  20.87
                 |   0.00 |  99.49 |   0.00 |   0.00 |   0.51 |
                 |   0.00 |  99.62 |   0.00 |   0.00 |   0.71 |
-----------------+--------+--------+--------+--------+--------+
Ontario          |      0 |      0 | 274094 | 1735.7 | 1839.8 | 277670
                 |   0.00 |   0.00 |  35.65 |   0.23 |   0.24 |  36.12
                 |   0.00 |   0.00 |  98.71 |   0.63 |   0.66 |
                 |   0.00 |   0.00 |  98.87 |   1.24 |   1.60 |
-----------------+--------+--------+--------+--------+--------+
Prairies         |      0 |      0 | 2704.4 | 135686 | 3384.2 | 141775
                 |   0.00 |   0.00 |   0.35 |  17.65 |   0.44 |  18.44
                 |   0.00 |   0.00 |   1.91 |  95.71 |   2.39 |
                 |   0.00 |   0.00 |   0.98 |  97.31 |   2.94 |
-----------------+--------+--------+--------+--------+--------+
British Columbia |      0 |      0 |      0 |      0 | 107931 | 107931
                 |   0.00 |   0.00 |   0.00 |   0.00 |  14.04 |  14.04
                 |   0.00 |   0.00 |   0.00 |   0.00 | 100.00 |
                 |   0.00 |   0.00 |   0.00 |   0.00 |  93.67 |
-----------------+--------+--------+--------+--------+--------+
Total             76634.1   160213   277228   139443   115222   768740
                     9.97    20.84    36.06    18.14    14.99   100.00

Frequency Missing = 2743.9987
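
For part [c], a hedged sketch of the weighted version. It uses the WEIGHT statement of PROC FREQ with ELGW26c, as named in the note above. The dataset name SLID.lj9394 is again an assumption taken from the table titles.

```sas
/* Assumes SLID.lj9394 exists and contains the weight ELGW26c */
TITLE1 'SLID.lj9394 - Part';
TITLE2 'Exclude missing values in the cross-tabulations';
TITLE3 'WEIGHT BY ELGW26C - longitudinal weight';

PROC FREQ DATA = SLID.lj9394;
  /* each record contributes its weight instead of a count of 1 */
  WEIGHT ELGW26c;
  TABLES SEX21*IMMST15
         REGRE25A*REGRE25C;
RUN;
```

With a WEIGHT statement the cell frequencies are sums of weights, which is why the "Frequency Missing" lines above are no longer whole numbers.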

[d] Summary Descriptive Statistics (unweighted data)

Note on the average income table: if your means, standard deviations and maximum values do not agree with the values given below, one of the variables must contain an incorrect missing value specification!

SLID.lj9394 - Part  
Means for income
Variable  Label                           N  Nmiss  Minimum    Maximum      Mean   Std Dev
------------------------------------------------------------------------------------------
TTINC27B  EF-Total money income - 1993  989     11   260.00  220873.00  53947.14  31442.81
TTINC27C  EF-Total money income - 1994  984     16   170.00  245190.00  53179.54  31510.56
ATINC27B  EF-After-tax income - 1993    989     11   260.00  190866.00  43739.22  23519.87
ATINC27C  EF-After-tax income - 1994    984     16   170.00  198336.00  43073.34  23479.36
------------------------------------------------------------------------------------------
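
The summary statistics in part [d] could be requested roughly as follows. This is a sketch that assumes the dataset name SLID.lj9394 (from the table titles) and assumes that the missing codes of the four income variables were recoded to SAS missing values in the DATA step.

```sas
/* Assumes SLID.lj9394 exists and income missing codes were recoded */
TITLE1 'SLID.lj9394 - Part';
TITLE2 'Means for income';

PROC MEANS DATA = SLID.lj9394 N NMISS MIN MAX MEAN STD MAXDEC = 2;
  VAR TTINC27B TTINC27C ATINC27B ATINC27C;
RUN;
```

If a missing code is left in the data as an ordinary number, PROC MEANS counts it in N and includes it in the Maximum, Mean and Std Dev, which is the symptom the note above describes.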