function to remove duplicate elements from list ?

J

Jimka

Guest
I thought of two really elegant ways of removing duplicate elements:

;; do it recursively
(defun uniqify (elements)
(cond ((memq (car elements) (cdr elements))
(uniqify (cdr elements)))
(elements
(cons (car elements) (uniqify (cdr elements))))))



;; and other elegant way that does not depend on a recursive function
(defun uniqify (elements)
(apply 'nconc
(foreach maplist sub elements
(unless (member (car sub) (cdr sub))
(list (car sub))))))
 
Jimka wrote:
I thought of two really elegant ways of removing duplicate elements:

;; do it recursively
(defun uniqify (elements)
(cond ((memq (car elements) (cdr elements))
(uniqify (cdr elements)))
(elements
(cons (car elements) (uniqify (cdr elements))))))



;; and other elegant way that does not depend on a recursive function
(defun uniqify (elements)
(apply 'nconc
(foreach maplist sub elements
(unless (member (car sub) (cdr sub))
(list (car sub))))))
I notice that the recursive solution has a tiny bug (well, it's not
really a bug, it's a skill bug in my opinion). It should use member()
instead of memq(). I will be slower, but it will work with this input:

inputList = list(
"abs"
"range"
"record"
"abs"
)

Otherwise it will not uniqify "abs" !

It seems that Cadence stores reserved words in an odd way ... memq()
has a hard time with these strings that are actual skill functions!

Nicolas
 
On 15 Mar 2006 13:12:44 -0800, nicolasperrier@gmail.com wrote:

Jimka wrote:
I thought of two really elegant ways of removing duplicate elements:

;; do it recursively
(defun uniqify (elements)
(cond ((memq (car elements) (cdr elements))
(uniqify (cdr elements)))
(elements
(cons (car elements) (uniqify (cdr elements))))))



;; and other elegant way that does not depend on a recursive function
(defun uniqify (elements)
(apply 'nconc
(foreach maplist sub elements
(unless (member (car sub) (cdr sub))
(list (car sub))))))

I notice that the recursive solution has a tiny bug (well, it's not
really a bug, it's a skill bug in my opinion). It should use member()
instead of memq(). I will be slower, but it will work with this input:

inputList = list(
"abs"
"range"
"record"
"abs"
)

Otherwise it will not uniqify "abs" !

It seems that Cadence stores reserved words in an odd way ... memq()
has a hard time with these strings that are actual skill functions!

Nicolas
It's not particularly that it's storing reserved words in a special way - but
you should never use memq with strings. It tries to optimise storage of strings
(being immutable), but doesn't guarantee that two stings which are the same will
be stored in the same location.

memq (and indeed eq) should only ever be used with symbols and lists.

Andrew.
 
procedure(uniqueList(list)
let((out)
setof(x list unless(member(x out) out = cons(x out)))
) ;let
) ;procedure



Andrew Beckett wrote:
On 15 Mar 2006 13:12:44 -0800, nicolasperrier@gmail.com wrote:


Jimka wrote:

I thought of two really elegant ways of removing duplicate elements:

;; do it recursively
(defun uniqify (elements)
(cond ((memq (car elements) (cdr elements))
(uniqify (cdr elements)))
(elements
(cons (car elements) (uniqify (cdr elements))))))



;; and other elegant way that does not depend on a recursive function
(defun uniqify (elements)
(apply 'nconc
(foreach maplist sub elements
(unless (member (car sub) (cdr sub))
(list (car sub))))))

I notice that the recursive solution has a tiny bug (well, it's not
really a bug, it's a skill bug in my opinion). It should use member()
instead of memq(). I will be slower, but it will work with this input:

inputList = list(
"abs"
"range"
"record"
"abs"
)

Otherwise it will not uniqify "abs" !

It seems that Cadence stores reserved words in an odd way ... memq()
has a hard time with these strings that are actual skill functions!

Nicolas


It's not particularly that it's storing reserved words in a special way - but
you should never use memq with strings. It tries to optimise storage of strings
(being immutable), but doesn't guarantee that two stings which are the same will
be stored in the same location.

memq (and indeed eq) should only ever be used with symbols and lists.

Andrew.

--
----------------------------------------------------------
Dominic DuVarney Phone: (512) 418-5762

Centaur Technology Email: duvarney at centtech dot com
----------------------------------------------------------
 
Dominic Duvarney wrote:
After thinking about it last night, I realized that for very large lists,
it would be much faster to avoid member(). ...
I am booked timewise or I would have posted this sooner so that
everyone benefits.
It might be helpful to all if SKILL experts compared & contrasted the
various methods.
The result could be instructive for all.

John Gianni
-- Nothing I post here is prior sanctioned by my employer!
;
***************************************************************************
; File Name: MyUnique.il
; Function Name: MyUnique()
; Title: Example uniquify SKILL routines for illustrative purposes
; Author: John Gianni compiled this file from other people's material
; Date: March 12, 2006
; Revision: 0.6
; Portability: PASSES the obligatory Tools->SKILL->Survey
; Compatibility: No incompatibly changed, deleted, or private
functions!
; SW Releases Surveyed: IC440 to IC610 inclusive
; Prerequisites; none
; Synopsys; MyUnique(list(2 5 3 2 8 2 1))
; MyUnique(list( "abs" "record" "range" "abs"))
; Documentation: none
; Support: none
; Warranty: none
; Description: Provide simple sample SKILL snippets to uniquify a given
list.
; Version History:
; - 0.1 3/12/2006 2 samples posted by Jim Newton to comp.cad.cadence
; - 0.2 3/13/2006 2 additional samples added by Dave Gluss to John
Gianni
; - 0.3 3/14/2006 2 further examples added by Jim Newton to John Gianni
; - 0.4 3/15/2006 Clarifications added by Jim Newton to John Gianni
; - 0.5 3/17/2006 Nicholas Perrier added 1 example to comp.cad.cadence
; - 0.6 3/21/2006 Dominic Duvarney added 2 examples to comp.cad.cadence
;
***************************************************************************
;
===========================================================================
; If n is the number of elements in the list, the following recursive
; procedure preserves the original list right-to-left order
; (1 2 3 1)-->(2 3 1); but it will only sort lists of a length less
; than the maximum recursion depth.
; Time-wise, it evaluates on the order N2 time.
; There may be a bit of additional garbage collection overhead.
; Don't use memq for anything other than symbols (see why below).
;
; MyUnique(list(2 5 3 2 8 2 1))
; REPORTS: (5 3 8 2 1)"
;
; MyUnique(list( "abs" "record" "range" "abs"))
; REPORTS: ("abs" "record" "range" "abs")
;
---------------------------------------------------------------------------
(defun MyUnique (elements)
(cond ((memq (car elements) (cdr elements))
(MyUnique (cdr elements)))
(elements
(cons (car elements) (MyUnique (cdr elements))))))
;
===========================================================================
; This adaptation of the recursive routine above uses "member" instead
; of using "memq" because memq should not be used with strings as memq
; (and eq) optimise storage of strings (being immutable), but they do
; not guarantee that two stings which are the same will be stored in
; the same location.
; However, this routine runs slower than that shown above.
;
; MyUnique(list(2 5 3 2 8 2 1))
; REPORTS: (5 3 8 2 1)
;
; MyUnique(list( "abs" "record" "range" "abs"))
; REPORTS: ("record" "range" "abs")
;
---------------------------------------------------------------------------
(defun MyUnique (elements)
(cond ((member (car elements) (cdr elements))
(MyUnique (cdr elements)))
(elements
(cons (car elements) (MyUnique (cdr elements))))))
;
===========================================================================
; The following sample preserves right-to-left order (1 2 3 1)-->(2 3
1);
; and it improves upon the garbage collection aspect; but it still is
of
; order N2 in time.
;
; MyUnique(list(2 5 3 2 8 2 1))
; REPORTS: (5 3 8 2 1)
;
; MyUnique(list( "abs" "record" "range" "abs"))
; REPORTS: ("record" "range" "abs")
;
---------------------------------------------------------------------------
(defun MyUnique (elements)
(apply 'nconc
(foreach maplist sub elements
(unless (member (car sub) (cdr sub))
(list (car sub))))))
;
===========================================================================
; This sample destroys the original list.
; However, the sort time is improved to N log N time.
; Note that the "setcdr" function fails a Tools->SKILL Survey for IC434
; but that setcdr exists in all releases from IC440 up to & including
IC610.
;
; MyUnique(list(2 5 3 2 8 2 1) 'lessp)
; REPORTS: (1 2 3 5 8)
;
; MyUnique(list( "abs" "record" "range" "abs") nil)
; REPORTS: ("abs" "range" "record")
;
---------------------------------------------------------------------------
procedure(MyUnique(L compare)
L=sort(L compare)
let(((p L))
while(p
if(car(p) == cadr(p) then
setcdr(p cddr(p))
else
p = cdr(p))))
L)
;
===========================================================================
; The following sample destroys the original list and preserves
; left-to-right order (1 2 3 1)-->(1 2 3). The strength of this sample
; is the sort time becomes on the order of N time. This might be useful

; for larger lists; however, some objects in SKILL (such as CDF
objects)
; are not hash-able.
;
; MyUnique(list(2 5 3 2 8 2 1))
; REPORTS: (2 5 3 8 1)
;
; MyUnique(list( "abs" "record" "range" "abs"))
; REPORTS: ("abs" "record" "range")
;
---------------------------------------------------------------------------
procedure(MyUnique(L)
let(((ht makeTable("uniquing" nil)) (q L) (p cdr(L)))
ht[car(L)] = t
while(p
if(ht[car(p)] then
p = cdr(p)
setcdr(q p)
else
ht[car(p)] = t
q = p
p = cdr(p)
))
L))
;
===========================================================================
; The sample below is a further variation of the above algorithm which
; is order N yet non-destructive - and it preserves the left-to-right
order.
;
; MyUnique(list(2 5 3 2 8 2 1))
; REPORTS: (2 5 3 8 1)
;
; MyUnique(list( "abs" "record" "range" "abs"))
; REPORTS: ("abs" "record" "range")
;
---------------------------------------------------------------------------
(defun MyUnique (L)
(let ((hash (makeTable 'hash nil))
(conc_list (list nil)))
(foreach element L
(unless (arrayref hash element)
(setarray hash element
(tconc conc_list element))))
(car conc_list)))
;
===========================================================================
; This sample is also non destructive and is of the order N in time;
; but it does not preserve order at all.
;
; MyUnique(list(2 5 3 2 8 2 1))
; REPORTS: (5 3 2 1 8)
;
; MyUnique(list( "abs" "record" "range" "abs"))
; REPORTS: ("range" "abs" "record")
;
---------------------------------------------------------------------------
(defun MyUnique (L)
(let ((hash (makeTable 'hash nil)))
(foreach element L
hash[element]=t)
hash->?))
;
===========================================================================
; Added by Dominic Duvarney
; Time is of order N2.
;
; MyUnique(list(2 5 3 2 8 2 1))
; REPORTS: (2 5 3 8 1)
;
; MyUnique(list( "abs" "record" "range" "abs"))
; REPORTS: ("abs" "record" "range")
;
---------------------------------------------------------------------------
procedure(MyUnique(list)
let((out)
setof(x list unless(member(x out) out = cons(x out)))
) ;let
) ;procedure
;
===========================================================================
; Avoiding member improves the speed for larger lists of the above
; from order N2 to linear order N at the expense of construction
; and garbage collection of data.
;
; MyUnique(list(2 5 3 2 8 2 1))
; REPORTS: (2 5 3 8 1)
;
; MyUnique(list( "abs" "record" "range" "abs"))
; REPORTS: ("abs" "record" "range")
;
---------------------------------------------------------------------------
procedure(MyUnique(lst)
let((tmpTable outList)
tmpTable = makeTable("tmp" nil)
foreach(x lst
unless(tmpTable[x]
outList = cons(x outList)
tmpTable[x] = t
)
)
reverse(outList)
)
)
;
===========================================================================
; End of:
; MyUnique.il
;
========================================================================
 
Dominic,

I did not profile anything, but if there is any significant time spend
cons'ing , or reversing, you can try tconc instead :

procedure( Uniq(l)
let( ( ul )
ul=list(nil)
foreach( e l
member(e car(ul)) || tconc(ul e)
);foreach
car(ul)
));let proc


Kind regards,
--
Frederic Oghdayan
Certified provider of not funny and not cute posts (tm).
Honorary worst SKILL identer 2005 -out of competition-

Dominic Duvarney wrote:
After thinking about it last night, I realized that for very large lists,
it would be much faster to avoid member(). I ran a couple of test cases
and the function below is by far quicker than my previous post.

procedure(uniqueList2(lst)
let((tmpTable outList)
tmpTable = makeTable("tmp" nil)
foreach(x lst
unless(tmpTable[x]
outList = cons(x outList)
tmpTable[x] = t
)
)
reverse(outList)
)
)


Dominic Duvarney wrote:

procedure(uniqueList(list)
let((out)
setof(x list unless(member(x out) out = cons(x out)))
) ;let
) ;procedure



Andrew Beckett wrote:

On 15 Mar 2006 13:12:44 -0800, nicolasperrier@gmail.com wrote:


Jimka wrote:

I thought of two really elegant ways of removing duplicate elements:

;; do it recursively
(defun uniqify (elements)
(cond ((memq (car elements) (cdr elements))
(uniqify (cdr elements)))
(elements
(cons (car elements) (uniqify (cdr elements))))))



;; and other elegant way that does not depend on a recursive function
(defun uniqify (elements)
(apply 'nconc
(foreach maplist sub elements
(unless (member (car sub) (cdr sub))
(list (car sub))))))



I notice that the recursive solution has a tiny bug (well, it's not
really a bug, it's a skill bug in my opinion). It should use member()
instead of memq(). I will be slower, but it will work with this input:

inputList = list(
"abs"
"range"
"record"
"abs"
)

Otherwise it will not uniqify "abs" !

It seems that Cadence stores reserved words in an odd way ... memq()
has a hard time with these strings that are actual skill functions!

Nicolas




It's not particularly that it's storing reserved words in a special
way - but
you should never use memq with strings. It tries to optimise storage
of strings
(being immutable), but doesn't guarantee that two stings which are
the same will
be stored in the same location.

memq (and indeed eq) should only ever be used with symbols and lists.

Andrew.
 
Hi Frederic,

I created the following test function to compare various options for implementing
the unique list function. As I get different solutions, I plug them in to see run
time differences. The reversing of the final list appears to be negligible in this case.
Perhaps under different circumstances, it would matter more.

procedure( Uniq(l)
let( ( ul )
ul=list(nil)
foreach( e l
member(e car(ul)) || tconc(ul e)
);foreach
car(ul)
));let proc

procedure(uniqueList2(lst)
let((tmpTable outList)
tmpTable = makeTable("tmp" nil)
foreach(x lst
unless(tmpTable[x]
outList = cons(x outList)
tmpTable[x] = t
)
)
reverse(outList)
)
)


***********function to test uniquify functions*******************


procedure(testUniqueList()
let((smallList bigList startTime newList newList2)
for(i 0 1000
smallList = cons(i smallList)
)
for(i 1 500
bigList = append(smallList bigList)
)
printf("bigList length --> %L\n" length(bigList))

startTime = getCurrentTime()
newList2= Uniq(bigList)
printf("Uniq took %L seconds \n" compareTime( getCurrentTime() startTime))

startTime = getCurrentTime()
newList= uniqueList2(bigList)
printf("UniqueList2 took %L seconds \n" compareTime( getCurrentTime() startTime))

if(newList2== newList then
println("the lists match")
else
println("the lists don't match")
)
t
)
)


testUniqueList()
bigList length --> 500500
Uniq took 15 seconds
UniqueList2 took 0 seconds
"the lists match"
t


fogh wrote:
Dominic,

I did not profile anything, but if there is any significant time spend
cons'ing , or reversing, you can try tconc instead :

procedure( Uniq(l)
let( ( ul )
ul=list(nil)
foreach( e l
member(e car(ul)) || tconc(ul e)
);foreach
car(ul)
));let proc


Kind regards,
 
After thinking about it last night, I realized that for very large lists,
it would be much faster to avoid member(). I ran a couple of test cases
and the function below is by far quicker than my previous post.

procedure(uniqueList2(lst)
let((tmpTable outList)
tmpTable = makeTable("tmp" nil)
foreach(x lst
unless(tmpTable[x]
outList = cons(x outList)
tmpTable[x] = t
)
)
reverse(outList)
)
)


Dominic Duvarney wrote:
procedure(uniqueList(list)
let((out)
setof(x list unless(member(x out) out = cons(x out)))
) ;let
) ;procedure



Andrew Beckett wrote:

On 15 Mar 2006 13:12:44 -0800, nicolasperrier@gmail.com wrote:


Jimka wrote:

I thought of two really elegant ways of removing duplicate elements:

;; do it recursively
(defun uniqify (elements)
(cond ((memq (car elements) (cdr elements))
(uniqify (cdr elements)))
(elements
(cons (car elements) (uniqify (cdr elements))))))



;; and other elegant way that does not depend on a recursive function
(defun uniqify (elements)
(apply 'nconc
(foreach maplist sub elements
(unless (member (car sub) (cdr sub))
(list (car sub))))))


I notice that the recursive solution has a tiny bug (well, it's not
really a bug, it's a skill bug in my opinion). It should use member()
instead of memq(). I will be slower, but it will work with this input:

inputList = list(
"abs"
"range"
"record"
"abs"
)

Otherwise it will not uniqify "abs" !

It seems that Cadence stores reserved words in an odd way ... memq()
has a hard time with these strings that are actual skill functions!

Nicolas



It's not particularly that it's storing reserved words in a special
way - but
you should never use memq with strings. It tries to optimise storage
of strings
(being immutable), but doesn't guarantee that two stings which are the
same will
be stored in the same location.

memq (and indeed eq) should only ever be used with symbols and lists.

Andrew.

--
----------------------------------------------------------
Dominic DuVarney Phone: (512) 418-5762

Centaur Technology Email: duvarney at centtech dot com
----------------------------------------------------------
 

Welcome to EDABoard.com

Sponsor

Back
Top