Translate SPSS output into R script

I am trying to translate the following SPSS output into R scripts, but given my lack of experience using SPSS, I'm struggling to translate exactly what was done. As far as I'm aware, the steps were intended to:

  • Select distinct cases by ID and Date
  • Identify duplicate cases
SORT CASES BY ID(A) Date(A).
MATCH FILES
  /FILE=*
  /BY ID Date
  /FIRST=PrimaryFirst
  /LAST=PrimaryLast.
DO IF (PrimaryFirst).
COMPUTE  MatchSequence=1-PrimaryLast.
ELSE.
COMPUTE  MatchSequence=MatchSequence+1.
END IF.
LEAVE  MatchSequence.
FORMATS  MatchSequence (f7).
COMPUTE  InDupGrp=MatchSequence>0.
SORT CASES InDupGrp(D).
MATCH FILES
  /FILE=*
  /DROP=PrimaryFirst InDupGrp MatchSequence.
VARIABLE LABELS  PrimaryLast 'Indicator of each last matching case as Primary'.
VALUE LABELS  PrimaryLast 0 'Duplicate Case' 1 'Primary Case'.
VARIABLE LEVEL  PrimaryLast (ORDINAL).
FREQUENCIES VARIABLES=PrimaryLast.
EXECUTE.

Any advice or assistance to translate the above segment would be appreciated.



Solution 1:[1]

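The syntax (not output ;)) you posted does indeed find and mark the rows where the same combination of ID and Date appears in more than one row: PrimaryLast is 1 for the last row of each ID/Date combination and 0 for the earlier rows with the same combination. You can replicate this easily in R; start by looking up the duplicated() function.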
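As a rough sketch of how that could look in base R: the data frame `df` and its values below are made up for illustration, and only the column names ID, Date and PrimaryLast are taken from the posted syntax. It assumes that SPSS's /LAST flag corresponds to `!duplicated(..., fromLast = TRUE)`, so treat it as a starting point rather than a verified one-to-one translation.

```r
# Hypothetical example data; replace with your own data frame.
df <- data.frame(
  ID   = c(1, 1, 2, 3, 3, 3),
  Date = as.Date(c("2020-01-01", "2020-01-01", "2020-01-02",
                   "2020-01-03", "2020-01-03", "2020-01-04"))
)

# SORT CASES BY ID(A) Date(A).
df <- df[order(df$ID, df$Date), ]

# The matching key used by MATCH FILES /BY ID Date.
key <- df[, c("ID", "Date")]

# /LAST=PrimaryLast: 1 for the last row of each ID/Date combination, else 0.
df$PrimaryLast <- as.integer(!duplicated(key, fromLast = TRUE))

# InDupGrp: TRUE when the ID/Date combination occurs more than once.
in_dup_grp <- duplicated(key) | duplicated(key, fromLast = TRUE)

# SORT CASES InDupGrp(D): move rows from duplicate groups to the top.
# The helper vector is never added to df, so there is nothing to drop.
df <- df[order(!in_dup_grp), ]

# VALUE LABELS + FREQUENCIES VARIABLES=PrimaryLast.
table(factor(df$PrimaryLast, levels = 0:1,
             labels = c("Duplicate Case", "Primary Case")))
```

If all you need is one row per ID/Date combination, `df[df$PrimaryLast == 1, ]` keeps the last row of each group.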

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 eli-k