Duplicate data based on various conditions in SAS

In the following data set I want to remove duplicates based on several conditions:

For Auris disease:

  • Same id, same condition (Auris): keep only the row with the first date, no matter what the date difference is.

For other disease conditions (Acino and CRE):

  • Same id, same condition, two dates: if the date difference is more than 90 days, keep the rows with the first date and the last date.

  • Same id, same condition, three or more dates: if the date difference is more than 90 days, keep the rows with the first date and the last date. Keep all three if the difference is more than 90 days between the first and second dates, and more than 90 days between the second and third dates.

    data have;
    input Id  Disease $ Date :mmddyy10.;
    format date mmddyy10.;
    datalines;
    123   Auris   01/01/2021
    123   CRE     09/02/2020
    344   CRE     08/06/2019
    344   CRE     03/06/2020
    344   CRE     03/03/2021
    323   CRE     01/06/2019
    323   CRE     09/06/2020
    323   CRE     09/09/2020
    167   Acino   03/06/2020
    167   Acino   03/19/2020
    167   Acino   09/03/2021
    256   Auris   08/05/2020
    256   Auris   10/07/2021
    317   Acino   12/07/2018
    317   Acino   01/03/2018
    ;;;;
    run;
    

The result should look like this:

  123   Auris   01/01/2021
  123   CRE     09/02/2020
  344   CRE     08/06/2019
  344   CRE     03/06/2020
  344   CRE     03/03/2021
  323   CRE     01/06/2019
  323   CRE     09/06/2020
  167   Acino   03/06/2020
  167   Acino   09/03/2021
  256   Auris   08/05/2020
  256   Auris   10/07/2021
  317   Acino   12/07/2018
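
For reference, here is a rough sketch of how the written rules could be expressed in a DATA step. It is not a definitive answer. It sorts by id, disease, and date, keeps the first row of each id/disease group, and for Acino and CRE also keeps any later row dated more than 90 days after the last kept row. Comparing against the last kept date, rather than the immediately preceding row, is an assumed reading of the rules, and the names `sorted`, `want`, and `last_kept` are made up for illustration. Because it applies the written rules to date-sorted data, it keeps only 08/05/2020 for id 256 (Auris) and keeps both dates for id 317, so it does not reproduce those two groups of the listing above exactly.

```sas
/* Sketch only: assumes the data set HAVE from the question.        */
/* SORTED, WANT and LAST_KEPT are hypothetical names.               */
proc sort data=have out=sorted;
    by id disease date;
run;

data want;
    set sorted;
    by id disease;
    retain last_kept;   /* date of the most recent row written out */

    if first.disease then do;
        /* The earliest date in each id/disease group is always kept */
        last_kept = date;
        output;
    end;
    else if disease ne 'Auris' and date - last_kept > 90 then do;
        /* Acino, CRE: keep a later date only if it is more than   */
        /* 90 days after the last kept date (assumed reading)      */
        last_kept = date;
        output;
    end;

    drop last_kept;
run;
```

The explicit `output` statements mean a row is written only when one of the two branches fires, so Auris rows after the first in a group are dropped without a separate rule.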
  

Thanks



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
