I recently took on the task of training undergraduate students in some of the basic data management & analysis procedures available in SPSS. This post consists of the primer that I’ve used to give my research assistants a footing in working with data sets in SPSS, and specifically in using the SPSS programming language to create scales instead of relying on point-and-click commands. Up-and-coming students might find this material useful, as may those of you who are training new students. Feel free to pass it along.
Starting new syntax: To open a new syntax editor window, click FILE –> NEW –> SYNTAX. This will open up a blank syntax window that you can begin to write into. The left-hand side shows a short list of the major commands in your syntax program. It will auto-update as you write new statements and commands in the main window on the right-hand side. You can save syntax files to your hard drive as .sps files. Note that you can also view the contents of any .sps file using a plain text program like Notepad (Windows) or TextEdit (Mac). This is handy when you just want to see what’s in a syntax file, but don’t have access to SPSS at the moment.
If using the point-and-click interface: As a general rule for learning purposes, if you do decide to use this interface, always run your commands by clicking the PASTE button instead of the OK button. Clicking PASTE will send syntax to the syntax editor window, so you can see exactly what SPSS is about to do for you, based on the program syntax you have just created by clicking all those buttons & ticking all those option boxes and whatnot.
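To give a sense of what PASTE produces, here is a sketch of the sort of syntax the Frequencies dialog might send to the editor. The variable name Q23_1 is borrowed from the examples further down, and the exact subcommands SPSS pastes will depend on your version and on the options you tick, so treat this as illustrative only.

```spss
* Roughly what Analyze > Descriptive Statistics > Frequencies might paste.
FREQUENCIES VARIABLES=Q23_1
  /ORDER=ANALYSIS.
```

Once the pasted block is in the editor, you can edit the variable list by hand and re-run it, which is usually quicker than going back through the dialog.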
Executing commands. Any operations that involve changing the data set in some way will require the use of the EXECUTE statement at the end of any syntax block. This statement tells the program to carry out whatever you have written. If you want to be lazy/efficient, you can also just type “exe.” and get the same results. This is just one of many little convenient but relatively unknown syntax shortcuts built into the program.
Common operations that change the data set include: COMPUTE, RECODE, RENAME VARIABLES, VARIABLE LABELS, and VALUE LABELS.
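As a minimal sketch of the pattern, the block below computes a new variable and then executes. Q1, Q2 and Q_TOTAL are hypothetical variable names chosen purely for illustration.

```spss
* Q1 and Q2 are hypothetical variable names, used only for illustration.
* The COMPUTE below stays pending until EXECUTE (or a procedure that reads the data) runs.
COMPUTE Q_TOTAL = Q1 + Q2.
EXECUTE.
```

If you run the COMPUTE line on its own, the new column typically shows up without values until the EXECUTE is run.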
Using notes in syntax. As a user, you can add notes and comments to your syntax very freely. Any line that begins with an asterisk will be treated as a user comment, meaning SPSS will ignore it when running the program syntax that you write. Ending your comment with a period ‘.’ symbol will tell SPSS that you are done with your comment (as will inserting two hard returns by pressing the ‘Enter’ key). Note, however, that if you continue typing on the same line after a period, or if you insert only a single hard return (without a preceding period), SPSS will assume you are still writing a comment. The easiest way to tell comments apart from substantive syntax is to look at their color. When you write comments into a syntax window, they turn grey by default, showing you which portions of the program SPSS will ignore (this is helpful in case you accidentally forget to add a period and turn one of your important commands into a comment). The examples below contain such notes/comments.
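The missing-period pitfall described above can be sketched as follows. Q1, Q2 and LOST are hypothetical names used only to show the behavior.

```spss
* A complete comment ends with a period.

* This comment is missing its period
COMPUTE LOST = Q1 + Q2.

* Because of the missing period above, SPSS reads the COMPUTE line as more comment text and never creates LOST.
```

In the syntax editor, the COMPUTE line in this sketch would be greyed out along with the comment above it, which is the visual cue that something has gone wrong.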
Running your syntax. When you want SPSS to actually do something with your program syntax, you can highlight the part of the program that you want it to run, and then click the run button in the toolbar (resembles a green “Play” button).
AN EXAMPLE: SPSS syntax to compute a scale total or a subscale score:
In all cases we need to know exactly which items belong to which subscales in order to compute new variables. This information is usually found in the documentation for a scale (e.g., a codebook or questionnaire key). If it is not there, search for the original journal article where the scale was first published (if it is a published scale), or in rarer cases, contact the author(s) for more information.
Example 1: When the scale(s) do not contain any reverse coded items, you can simply compute sums or averages.
Computing sums for the three subscales of the Dirty Dozen measure of The Dark Triad 1:
Note – copying and pasting this, or any other blocks of program syntax directly into SPSS would work just fine if you are using a data set that contains such variables.
*Dark triad*
*Items Q23_1 to Q23_12.
*Reverse coded items: None.
*Narcissism = item 4,5,6,7
*Psychopathy = item 1,2,11,12
*Machiavellianism = item 3, 8, 9, 10.
*Compute sums for the three subscales. As a good rule of thumb, I try to keep variable names short and sweet (8 characters or less).
compute NARCISM = Q23_4 + Q23_5 + Q23_6 + Q23_7.
compute PSYCHOP = Q23_1 + Q23_2 + Q23_11 + Q23_12.
compute MACHVLN = Q23_3 + Q23_8 + Q23_9 + Q23_10.
execute.
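The example above computes sums. If you would rather have averages, one possible sketch uses the built-in MEAN function on the same items. The variable names ending in _M are my own hypothetical choices, not part of the original scoring.

```spss
* Mean-based versions of the three Dirty Dozen subscales.
* The _M suffix is a hypothetical naming choice, not part of the original example.
COMPUTE NARC_M = MEAN(Q23_4, Q23_5, Q23_6, Q23_7).
COMPUTE PSYC_M = MEAN(Q23_1, Q23_2, Q23_11, Q23_12).
COMPUTE MACH_M = MEAN(Q23_3, Q23_8, Q23_9, Q23_10).
EXECUTE.
```

One difference worth knowing: adding items with + returns a missing value whenever any single item is missing, whereas MEAN averages whatever items are present. If you want to require all four items, MEAN.4(...) in place of MEAN(...) should enforce that.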
Example 2: When scales do contain reverse coded items, you have to reverse code items BEFORE you compute sums or averages. Doing this requires that you know two pieces of information:
- Which items need to be reverse coded
- How many scale points each item on this measure has (i.e., are the questions on a 1 to 5 scale? 1 to 9? Etc.)
Usually we get this information from the scale materials like a codebook or a questionnaire key. Knowing this, you can reverse score an item using the simple equation:
XREVi = (k+1) – Xi.
XREVi = The reversed score we are trying to compute for person “i”
k = number of scale points on the question we are trying to reverse code
Xi = the actual raw score we already have in the data set for a particular person “i”
Using this equation allows us to flip scores around, which makes them suitable for computing subscale means or sums afterwards. So, as an example, items on a 1 to 5 scale would be reversed as follows:
If X=1, we’d want it to equal 5 when reversed:
XREVi = (k+1) – Xi = (5+1) – 1 = 6 – 1 = 5
If X = 4, we’d want it to equal 2 when reversed:
XREVi = (k+1) – Xi = (5+1) – 4 = 6 – 4 = 2
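In syntax, the same reversal for a 1 to 5 item could look something like the sketch below. X, X_REV and X_REV2 are hypothetical variable names; the second block shows RECODE as an alternative way of getting the same result.

```spss
* X is a hypothetical item scored 1 to 5, so k + 1 = 6.
COMPUTE X_REV = 6 - X.
EXECUTE.

* The same reversal written out value by value.
RECODE X (1=5) (2=4) (3=3) (4=2) (5=1) INTO X_REV2.
EXECUTE.
```

The COMPUTE version is shorter and scales easily to longer response formats, while the RECODE version makes every mapping explicit, which some people find easier to audit.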
Reverse coding items as needed, and then computing averages for the five subscales of the Ten-Item Personality Inventory (TIPI) 2:
Note – in this example, I decided to give all the TIPI variables names that make more intuitive sense, by simply creating copies of the non-reverse coded items (the odd-numbered ones). This is not strictly necessary, but it is a common practice of mine. I do this so that later on, I have a much easier time working with the scales. For example, if I wanted to run a reliability analysis, I could simply pop in the new list of TIPI items, which will all be found together at the end of the data set. This is much easier than plugging in an inconsistent set of items like “Q26_1, TIPI02, Q26_3” and so forth. Creating new versions not only allows for a more cleanly structured data set, it also better enables you to pull off some of that helpful Excel integration I’ve talked about in the past.
*Personality (TIPI)*
*Items Q26_1 to Q26_10.
*Item Recoding or Reverse Scoring: Reversed items: 02, 04, 06, 08, 10.
*Scoring the TIPI:
*1. Recode the reverse-scored even items (the scale is a 1 to 7, so reverse coding requires computing (k+1)-X = (7+1)-X = 8-X).
*Also create copies of the odd items, just for ease of reference.
compute TIPI01 = Q26_1.
compute TIPI02 = 8-Q26_2.
compute TIPI03 = Q26_3.
compute TIPI04 = 8-Q26_4.
compute TIPI05 = Q26_5.
compute TIPI06 = 8-Q26_6.
compute TIPI07 = Q26_7.
compute TIPI08 = 8-Q26_8.
compute TIPI09 = Q26_9.
compute TIPI10 = 8-Q26_10.
execute.
*Subscale Scoring (‘R’ = reverse scored):
*Openness: 5, 10R
*Conscientiousness: 3, 8R
*Extraversion: 1, 6R
*Agreeableness: 2R, 7
*Neuroticism: 4R, 9.
*Note: with item 4 reversed and item 9 left as is, this last composite is keyed toward Emotional Stability, so higher NEUR scores here mean LOWER neuroticism.
*Compute means for subscales now that we have reverse-scored even items.
compute OPEN = (TIPI05 + TIPI10)/2.
compute CONS = (TIPI03 + TIPI08)/2.
compute EXTR = (TIPI01 + TIPI06)/2.
compute AGRE = (TIPI02 + TIPI07)/2.
compute NEUR = (TIPI04 + TIPI09)/2.
execute.
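With the renamed items in place, the reliability analysis mentioned earlier becomes easy to write. Here is a minimal sketch for the two Extraversion items; the subcommands shown are a common minimal set and not the only options available.

```spss
* Internal consistency for the two Extraversion items created above.
RELIABILITY
  /VARIABLES=TIPI01 TIPI06
  /SCALE('Extraversion') ALL
  /MODEL=ALPHA.
```

To check another subscale, swap in its pair of items (for example TIPI03 and TIPI08 for Conscientiousness) and change the label in the SCALE subcommand.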
Once you have successfully constructed your scales, you’re free to carry out any descriptive or inferential analyses of interest (as long as you ensure the quality of your scales beforehand, of course). My strong advice to all up and comers is to do as much of your work as possible via syntax rather than pointing and clicking. Tinker around with the syntax whenever you run any procedure, just to see how things operate under the hood. Spend enough time probing the language, and before you know it you’ll develop a repertoire of shortcuts and tricks, and be managing data sets & running analyses with ease.
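As one simple quality check before moving on to analyses, you could look at the range of each new scale score. This is only a sketch using the TIPI subscale names from the example above; because the items run from 1 to 7, every subscale mean should fall between 1 and 7.

```spss
* Quick range check on the TIPI subscale means computed earlier.
DESCRIPTIVES VARIABLES=OPEN CONS EXTR AGRE NEUR
  /STATISTICS=MEAN STDDEV MIN MAX.
```

A minimum below 1 or a maximum above 7 would suggest a reverse-coding mistake or stray values in the raw items.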
2 Gosling, S. D., Rentfrow, P. J., & Swann, W. B., Jr. (2003). A Very Brief Measure of the Big Five Personality Domains. Journal of Research in Personality, 37, 504-528.