Remove Duplicates from Linked List

If the list is sorted (non-decreasing values), duplicates sit next to each other. You can remove them in one pass: whenever cur->data == cur->next->data, skip and free the next node; otherwise advance cur. Time O(n), extra space O(1). If the list is not sorted, adjacent equal values are not guaranteed—you typically use a hash set (O(n) time, O(n) space) or sort first (often O(n log n)), depending on constraints.

C program (sorted list, in-place removal)

The list is built with insert_begin as in insert at beginning. Inserts in order 30, 20, 20, 20, 10 produce values 10 → 20 → 20 → 20 → 30 (sorted ascending). remove_duplicates_sorted(head) collapses runs of equal keys to a single node. main prints before and after, then free_list.

  • head == NULL: nothing to do; the loop exits immediately.
  • After removing a duplicate, do not advance cur yet—the new cur->next might still match cur->data (e.g. three 20s in a row).
#include <stdio.h>
#include <stdlib.h>

struct Node {
    int data;
    struct Node *next;
};

struct Node *insert_begin(struct Node *head, int data)
{
    struct Node *newNode = malloc(sizeof(struct Node));
    if (newNode == NULL)
        return head;

    newNode->data = data;
    newNode->next = head;
    return newNode;
}

void remove_duplicates_sorted(struct Node *head)
{
    struct Node *cur = head;

    while (cur != NULL && cur->next != NULL) {
        if (cur->data == cur->next->data) {
            struct Node *dup = cur->next;
            cur->next = dup->next;
            free(dup);
        } else {
            cur = cur->next;
        }
    }
}

void print_list(struct Node *head)
{
    struct Node *cur = head;
    while (cur != NULL) {
        printf("%d ", cur->data);
        cur = cur->next;
    }
    printf("\n");
}

void free_list(struct Node *head)
{
    struct Node *cur = head;
    while (cur != NULL) {
        struct Node *next = cur->next;
        free(cur);
        cur = next;
    }
}

int main(void)
{
    struct Node *head = NULL;

    head = insert_begin(head, 30);
    head = insert_begin(head, 20);
    head = insert_begin(head, 20);
    head = insert_begin(head, 20);
    head = insert_begin(head, 10);

    printf("before: ");
    print_list(head);   /* 10 20 20 20 30 */

    remove_duplicates_sorted(head);

    printf("after:  ");
    print_list(head);   /* 10 20 30 */

    free_list(head);
    return 0;
}

Program output

before: 10 20 20 20 30
after: 10 20 30

Overall program flow

Step Operation head Notes
1 head = NULL, then five insert_begin calls 0x5000 List 10 → 20 → 20 → 20 → 30
2 printf("before: "); print_list(head) Unchanged Output: before: 10 20 20 20 30
3 remove_duplicates_sorted(head) Unchanged Two duplicate nodes freed; list 10 → 20 → 30
4 printf("after: "); print_list(head) Unchanged Output: after: 10 20 30
5 free_list(head) Freed Remaining three nodes released

Pointer walk inside remove_duplicates_sorted (initial list)

Example addresses: 10 at 0x5000, three 20s at 0x4000, 0x3000, 0x2000, 30 at 0x1000.

Phase cur Action List shape (values)
start0x5000 (10)10 ≠ 20cur = cur->next10 20 20 20 30
at first 200x400020 == 20 → free 0x3000, splice10 20 20 30
still first 200x400020 == 20 → free 0x2000, splice10 20 30
still first 200x400020 ≠ 30cur = cur->next10 20 30
at 300x1000cur->next == NULL → exit10 20 30

Memory layout (example addresses, before removal)

Five nodes (insert order last-to-first: 1030)

Address Node data next
0x5000 n1 (head) 10 0x4000
0x4000 n2 20 0x3000
0x3000 n3 20 0x2000
0x2000 n4 20 0x1000
0x1000 n5 30 NULL

Visual representation (before → after)

BEFORE                          AFTER
head (0x5000)                   head (0x5000)
   │                               │
   ▼                               ▼
10 → 20 → 20 → 20 → 30 → ∅       10 → 20 → 30 → ∅
         └── duplicate run ──┘         (unique)

Detailed step-by-step: first duplicate removal

When cur points at the first 20 (0x4000) and cur->next is the second 20 (0x3000):

Line Code Before After
1 cur->data == cur->next->data Both 20 Enter if branch
2 dup = cur->next; dup is 0x3000
3 cur->next = dup->next; cur->next was 0x3000 cur->next now 0x2000 (third 20)
4 free(dup); Node 0x3000 allocated Second 20 freed
5 Loop continues; cur unchanged cur still 0x4000 Next test compares first and third 20

Complexity summary

Operation Time Extra space
remove_duplicates_sorted (sorted list) O(n) — each node visited a constant number of times O(1)
Unsorted: hash set of seen values O(n) average O(n)
Unsorted: sort then same pass O(n log n) typical O(1) to O(n) depending on sort
insert_begin O(1) per call O(1) per new node
print_list / free_list O(n) O(1)

The key takeaway is that sorted order lets you decide duplicates using only cur and cur->next—no extra structure. For unsorted data, plan for space–time tradeoffs (hash vs sort).

This code assumes non-decreasing order. Descending sorted lists work the same (duplicates still adjacent). Unsorted lists need a different strategy; stable identity of nodes is not preserved beyond “one survivor per key” in the sorted case.