Remove Duplicates from Linked List

If the list is sorted (non-decreasing values), duplicates sit next to each other. You can remove them in one pass: whenever cur->data == cur->next->data, skip and free the next node; otherwise advance cur. Time O(n), extra space O(1). If the list is not sorted, adjacent equal values are not guaranteed—you typically use a hash set (O(n) time, O(n) space) or sort first (often O(n log n)), depending on constraints.

C program (sorted list, in-place removal)

The list is built with insert_begin as in insert at beginning. Inserts in order 30, 20, 20, 20, 10 produce values 10 → 20 → 20 → 20 → 30 (sorted ascending). remove_duplicates_sorted(head) collapses runs of equal keys to a single node. main prints before and after, then free_list.

head == NULL: nothing to do; the loop exits immediately.
After removing a duplicate, do not advance cur yet—the new cur->next might still match cur->data (e.g. three 20s in a row).

#include <stdio.h>
#include <stdlib.h>

struct Node {
    int data;
    struct Node *next;
};

struct Node *insert_begin(struct Node *head, int data)
{
    struct Node *newNode = malloc(sizeof(struct Node));
    if (newNode == NULL)
        return head;

    newNode->data = data;
    newNode->next = head;
    return newNode;
}

void remove_duplicates_sorted(struct Node *head)
{
    struct Node *cur = head;

    while (cur != NULL && cur->next != NULL) {
        if (cur->data == cur->next->data) {
            struct Node *dup = cur->next;
            cur->next = dup->next;
            free(dup);
        } else {
            cur = cur->next;
        }
    }
}

void print_list(struct Node *head)
{
    struct Node *cur = head;
    while (cur != NULL) {
        printf("%d ", cur->data);
        cur = cur->next;
    }
    printf("\n");
}

void free_list(struct Node *head)
{
    struct Node *cur = head;
    while (cur != NULL) {
        struct Node *next = cur->next;
        free(cur);
        cur = next;
    }
}

int main(void)
{
    struct Node *head = NULL;

    head = insert_begin(head, 30);
    head = insert_begin(head, 20);
    head = insert_begin(head, 20);
    head = insert_begin(head, 20);
    head = insert_begin(head, 10);

    printf("before: ");
    print_list(head);   /* 10 20 20 20 30 */

    remove_duplicates_sorted(head);

    printf("after:  ");
    print_list(head);   /* 10 20 30 */

    free_list(head);
    return 0;
}

Program output

before: 10 20 20 20 30
after: 10 20 30

Overall program flow

Step	Operation	`head`	Notes
1	`head = NULL`, then five `insert_begin` calls	`0x5000`	List `10 → 20 → 20 → 20 → 30`
2	`printf("before: "); print_list(head)`	Unchanged	Output: `before: 10 20 20 20 30`
3	`remove_duplicates_sorted(head)`	Unchanged	Two duplicate nodes freed; list `10 → 20 → 30`
4	`printf("after: "); print_list(head)`	Unchanged	Output: `after: 10 20 30`
5	`free_list(head)`	Freed	Remaining three nodes released

Pointer walk inside `remove_duplicates_sorted` (initial list)

Example addresses: 10 at 0x5000, three 20s at 0x4000, 0x3000, 0x2000, 30 at 0x1000.

Phase	`cur`	Action	List shape (values)
start	`0x5000` (10)	`10 ≠ 20` → `cur = cur->next`	`10 20 20 20 30`
at first 20	`0x4000`	`20 == 20` → free `0x3000`, splice	`10 20 20 30`
still first 20	`0x4000`	`20 == 20` → free `0x2000`, splice	`10 20 30`
still first 20	`0x4000`	`20 ≠ 30` → `cur = cur->next`	`10 20 30`
at 30	`0x1000`	`cur->next == NULL` → exit	`10 20 30`

Memory layout (example addresses, before removal)

Five nodes (insert order last-to-first: 10 … 30)

Address	Node	`data`	`next`
`0x5000`	`n1` (head)	10	`0x4000`
`0x4000`	`n2`	20	`0x3000`
`0x3000`	`n3`	20	`0x2000`
`0x2000`	`n4`	20	`0x1000`
`0x1000`	`n5`	30	`NULL`

Visual representation (before → after)

BEFORE                          AFTER
head (0x5000)                   head (0x5000)
   │                               │
   ▼                               ▼
10 → 20 → 20 → 20 → 30 → ∅       10 → 20 → 30 → ∅
         └── duplicate run ──┘         (unique)

Detailed step-by-step: first duplicate removal

When cur points at the first 20 (0x4000) and cur->next is the second 20 (0x3000):

Line	Code	Before	After
1	`cur->data == cur->next->data`	Both `20`	Enter `if` branch
2	`dup = cur->next;`	—	`dup` is `0x3000`
3	`cur->next = dup->next;`	`cur->next` was `0x3000`	`cur->next` now `0x2000` (third `20`)
4	`free(dup);`	Node `0x3000` allocated	Second `20` freed
5	Loop continues; `cur` unchanged	`cur` still `0x4000`	Next test compares first and third `20`

Complexity summary

Operation	Time	Extra space
`remove_duplicates_sorted` (sorted list)	`O(n)` — each node visited a constant number of times	`O(1)`
Unsorted: hash set of seen values	`O(n)` average	`O(n)`
Unsorted: sort then same pass	`O(n log n)` typical	`O(1)` to `O(n)` depending on sort
`insert_begin`	`O(1)` per call	`O(1)` per new node
`print_list` / `free_list`	`O(n)`	`O(1)`

The key takeaway is that sorted order lets you decide duplicates using only cur and cur->next—no extra structure. For unsorted data, plan for space–time tradeoffs (hash vs sort).

This code assumes non-decreasing order. Descending sorted lists work the same (duplicates still adjacent). Unsorted lists need a different strategy; stable identity of nodes is not preserved beyond “one survivor per key” in the sorted case.