Remove Duplicates from Linked List
If the list is sorted (non-decreasing values), duplicates sit next to each other. You can remove them in one pass: whenever cur->data == cur->next->data, skip and free the next node; otherwise advance cur. Time O(n), extra space O(1). If the list is not sorted, adjacent equal values are not guaranteed—you typically use a hash set (O(n) time, O(n) space) or sort first (often O(n log n)), depending on constraints.
C program (sorted list, in-place removal)
The list is built with insert_begin as in insert at beginning. Inserts in order 30, 20, 20, 20, 10 produce values 10 → 20 → 20 → 20 → 30 (sorted ascending). remove_duplicates_sorted(head) collapses runs of equal keys to a single node. main prints before and after, then free_list.
head == NULL: nothing to do; the loop exits immediately.- After removing a duplicate, do not advance
curyet—the newcur->nextmight still matchcur->data(e.g. three20s in a row).
#include <stdio.h>
#include <stdlib.h>
struct Node {
int data;
struct Node *next;
};
struct Node *insert_begin(struct Node *head, int data)
{
struct Node *newNode = malloc(sizeof(struct Node));
if (newNode == NULL)
return head;
newNode->data = data;
newNode->next = head;
return newNode;
}
void remove_duplicates_sorted(struct Node *head)
{
struct Node *cur = head;
while (cur != NULL && cur->next != NULL) {
if (cur->data == cur->next->data) {
struct Node *dup = cur->next;
cur->next = dup->next;
free(dup);
} else {
cur = cur->next;
}
}
}
void print_list(struct Node *head)
{
struct Node *cur = head;
while (cur != NULL) {
printf("%d ", cur->data);
cur = cur->next;
}
printf("\n");
}
void free_list(struct Node *head)
{
struct Node *cur = head;
while (cur != NULL) {
struct Node *next = cur->next;
free(cur);
cur = next;
}
}
int main(void)
{
struct Node *head = NULL;
head = insert_begin(head, 30);
head = insert_begin(head, 20);
head = insert_begin(head, 20);
head = insert_begin(head, 20);
head = insert_begin(head, 10);
printf("before: ");
print_list(head); /* 10 20 20 20 30 */
remove_duplicates_sorted(head);
printf("after: ");
print_list(head); /* 10 20 30 */
free_list(head);
return 0;
}
Program output
after: 10 20 30
Overall program flow
| Step | Operation | head |
Notes |
|---|---|---|---|
| 1 | head = NULL, then five insert_begin calls |
0x5000 |
List 10 → 20 → 20 → 20 → 30 |
| 2 | printf("before: "); print_list(head) |
Unchanged | Output: before: 10 20 20 20 30 |
| 3 | remove_duplicates_sorted(head) |
Unchanged | Two duplicate nodes freed; list 10 → 20 → 30 |
| 4 | printf("after: "); print_list(head) |
Unchanged | Output: after: 10 20 30 |
| 5 | free_list(head) |
Freed | Remaining three nodes released |
Pointer walk inside remove_duplicates_sorted (initial list)
Example addresses: 10 at 0x5000, three 20s at 0x4000, 0x3000, 0x2000, 30 at 0x1000.
| Phase | cur |
Action | List shape (values) |
|---|---|---|---|
| start | 0x5000 (10) | 10 ≠ 20 → cur = cur->next | 10 20 20 20 30 |
| at first 20 | 0x4000 | 20 == 20 → free 0x3000, splice | 10 20 20 30 |
| still first 20 | 0x4000 | 20 == 20 → free 0x2000, splice | 10 20 30 |
| still first 20 | 0x4000 | 20 ≠ 30 → cur = cur->next | 10 20 30 |
| at 30 | 0x1000 | cur->next == NULL → exit | 10 20 30 |
Memory layout (example addresses, before removal)
Five nodes (insert order last-to-first: 10 … 30)
| Address | Node | data |
next |
|---|---|---|---|
0x5000 |
n1 (head) |
10 | 0x4000 |
0x4000 |
n2 |
20 | 0x3000 |
0x3000 |
n3 |
20 | 0x2000 |
0x2000 |
n4 |
20 | 0x1000 |
0x1000 |
n5 |
30 | NULL |
Visual representation (before → after)
BEFORE AFTER
head (0x5000) head (0x5000)
│ │
▼ ▼
10 → 20 → 20 → 20 → 30 → ∅ 10 → 20 → 30 → ∅
└── duplicate run ──┘ (unique)
Detailed step-by-step: first duplicate removal
When cur points at the first 20 (0x4000) and cur->next is the second 20 (0x3000):
| Line | Code | Before | After |
|---|---|---|---|
| 1 | cur->data == cur->next->data |
Both 20 |
Enter if branch |
| 2 | dup = cur->next; |
— | dup is 0x3000 |
| 3 | cur->next = dup->next; |
cur->next was 0x3000 |
cur->next now 0x2000 (third 20) |
| 4 | free(dup); |
Node 0x3000 allocated |
Second 20 freed |
| 5 | Loop continues; cur unchanged |
cur still 0x4000 |
Next test compares first and third 20 |
Complexity summary
| Operation | Time | Extra space |
|---|---|---|
remove_duplicates_sorted (sorted list) |
O(n) — each node visited a constant number of times |
O(1) |
| Unsorted: hash set of seen values | O(n) average |
O(n) |
| Unsorted: sort then same pass | O(n log n) typical |
O(1) to O(n) depending on sort |
insert_begin |
O(1) per call |
O(1) per new node |
print_list / free_list |
O(n) |
O(1) |
The key takeaway is that sorted order lets you decide duplicates using only cur and cur->next—no extra structure. For unsorted data, plan for space–time tradeoffs (hash vs sort).
This code assumes non-decreasing order. Descending sorted lists work the same (duplicates still adjacent). Unsorted lists need a different strategy; stable identity of nodes is not preserved beyond “one survivor per key” in the sorted case.