SAS software provides the power for records to be reviewed and exceptions reported. Along came a problem that had to be solved: a dataset with 500,000 observations was suspected to have duplicates. Presented is how a simple solution turned into a macro that could be used on any dataset....
So, I want to find duplicates based on two variables, 'id' and 'admissiondate', and create a data set without duplicates. Before I create that data set, I need to know how many duplicate entries the data set has, for verification purposes. Thanks S0...
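One way to approach the question above is sketched here. It assumes the input data set is called `have` (a hypothetical name) and that `id` and `admissiondate` together define a duplicate. The output names `dup_keys`, `nodups`, and `dups` are also illustrative.

```sas
/* Assumption: WORK.HAVE holds the raw data with variables id and admissiondate */

/* Step 1: list every id/admissiondate key that occurs more than once */
proc sql;
  create table dup_keys as
  select id, admissiondate, count(*) as n_records
  from have
  group by id, admissiondate
  having count(*) > 1;
quit;

/* Step 2: keep the first record per key in NODUPS, route the rest to DUPS */
proc sort data=have out=nodups dupout=dups nodupkey;
  by id admissiondate;
run;
```

The number of observations in `dups` is the count of extra records removed, which can be checked against `dup_keys` (sum of `n_records` minus the number of rows) before relying on `nodups`.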
Finding those duplicated values in data, records, and other sources is easy using SAS's "FIRST." and "LAST." expressions. This paper will explain how SAS can be used to help find those dups. Clarence Wm. Jackson, CSQA
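As a minimal sketch of the FIRST./LAST. technique (not the paper's own code), the step below splits a data set into records whose key occurs once and records whose key is repeated. The data set name `have` and the variable name `key` are hypothetical.

```sas
/* Assumption: WORK.HAVE has a variable named KEY that should be unique */

/* BY-group processing requires the data to be sorted by the BY variable */
proc sort data=have out=sorted;
  by key;
run;

data unique dups;
  set sorted;
  by key;
  /* A BY group with exactly one record is both the first and the last */
  if first.key and last.key then output unique;
  else output dups;   /* every record of a repeated key lands here */
run;
```

Unlike a de-duplicating sort, this keeps all copies of a repeated key together in `dups`, which is convenient when the duplicates need to be reviewed rather than silently dropped.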
Assuming there can be more than one certifydate within a filedate, and that those certifydates are distinct, the following data step would output two datasets, i.e. one keeping your most recent certifydate and the other holding the duplicates: data want dup; set have; by id filedate; if ...
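The data step in the reply above is cut off, so the following is only one plausible way to get the same two outputs, not the poster's actual code. It assumes `certifydate` is a numeric SAS date and that `filedate` is stored so that sorting groups equal values together. The intermediate name `sorted` is illustrative.

```sas
/* Assumption: WORK.HAVE has id, filedate, and a numeric certifydate */

/* Sort so that the most recent certifydate is last within each id/filedate */
proc sort data=have out=sorted;
  by id filedate certifydate;
run;

data want dup;
  set sorted;
  by id filedate;
  if last.filedate then output want;  /* latest certifydate in the group */
  else output dup;                    /* earlier certifydates for the same key */
run;
```

Sorting by `certifydate` as a third key is what makes `last.filedate` correspond to the most recent certification. Without it, the record kept would depend on input order.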
I have the data below (an Excel dataset is also attached) and I would like to check for duplicates and then remove the duplicates. For example, in the data set below ID 400 has a filedate of May_2019 and two different certify dates. I would like to identify IDs that have duplicate...
Part 2 would be to then have results that show there are 3 total matching KRAS mutations. Is that possible in SAS? Thank you so much for the help!
Accepted solution from Astounding (PROC Star), Re: Finding matching observations among a variable, posted 01-08-2018 03:04 PM (510...
This only looks at the absolute difference, so it works for exactly the question asked, but if you have ties you'll end up with duplicates in the output. proc sql; create table want as select t1.a, t1.b, t2.c, abs(t1.b - t2.b) as diff from data1 as t1 left join dat...
from have as a inner join have as b on (a.id < b.id /* prevents duplicates and self-matches */ AND ( &soundweight*(complev(a.firstnamesound,b.firstnamesound)+complev(a.lastnamesound,b.lastnamesound)) +&gedisweight*(compged(a.firstname,b.firstname)+compged(a.lastname,b.lastna...
And I am curious as to what you mean by "so that the position doesn't matter", because I tried the code with different positions of categories in a row and category duplicates, and it keeps the last occurrence of a given category regardless of position. Though being a ...