Introduction to Data Structures and Algorithms (DSA)
Importance of DSA in Problem-Solving
- Efficiency: Helps write optimized and efficient code.
- Scalability: Ensures programs run smoothly even with large inputs.
- Competitive Programming: Essential for coding contests and job interviews.
- Real-world Applications: Used in AI, ML, web development, and more.
Time & Space Complexity
Time complexity measures how an algorithm's running time grows with input size; space complexity measures how its memory use grows.
Big-O Notation (O)
- Upper bound on an algorithm's growth rate.
- Commonly used to describe the worst-case running time.
- Example: O(n²) for Bubble Sort.
Big-Theta Notation (Θ)
- Tight bound: the running time is bounded both above and below by the same rate of growth.
- Note: Θ is a tight bound, not the average case, although the two are often conflated.
- Example: Θ(n log n) for Merge Sort.
Big-Omega Notation (Ω)
- Lower bound on an algorithm's growth rate.
- Commonly used to describe the best-case running time.
- Example: Ω(n) for Insertion Sort in best case (sorted array).
Example: Analyzing Complexity
Code:
void exampleFunction(int n) {
    for (int i = 0; i < n; i++) {        // outer loop: O(n)
        for (int j = 0; j < n; j++) {    // inner loop: O(n)
            cout << i << j;
        }
    }
}
Time Complexity = O(n * n) = O(n²)
Asymptotic Notations
- O(1) - Constant time (e.g., accessing an array element).
- O(log n) - Logarithmic time (e.g., Binary Search).
- O(n) - Linear time (e.g., Traversing an array).
- O(n log n) - Linearithmic time (e.g., Merge Sort, Quick Sort in average case).
- O(n²) - Quadratic time (e.g., Bubble Sort, Selection Sort).
Recursion & Backtracking
Recursion
- A function calls itself.
- Base case stops recursion.
- Example:
int factorial(int n) {
    if (n == 0) return 1;           // Base case
    return n * factorial(n - 1);    // Recursive case
}
Backtracking
- A refined form of recursion.
- Tries all possibilities and reverts if a condition fails.
- Example: Solving N-Queens problem.
bool solveNQueens(int board[N][N], int col) {
    if (col >= N) return true;                 // all queens placed
    for (int i = 0; i < N; i++) {
        if (isSafe(board, i, col)) {
            board[i][col] = 1;
            if (solveNQueens(board, col + 1)) return true;
            board[i][col] = 0;                 // Backtrack
        }
    }
    return false;
}
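The code above relies on an isSafe helper and a board size N that are not shown; a minimal sketch of that helper, assuming a global constant N and checking only the columns to the left of the current one, could look like this:
const int N = 8;   // assumed board size used by solveNQueens above

// Check whether a queen can be placed at board[row][col]:
// only columns to the left can already contain queens.
bool isSafe(int board[N][N], int row, int col) {
    for (int j = 0; j < col; j++)                             // same row, to the left
        if (board[row][j]) return false;
    for (int i = row, j = col; i >= 0 && j >= 0; i--, j--)    // upper-left diagonal
        if (board[i][j]) return false;
    for (int i = row, j = col; i < N && j >= 0; i++, j--)     // lower-left diagonal
        if (board[i][j]) return false;
    return true;
}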
Linear Data Structures
1. Arrays
An array is a collection of elements stored at contiguous memory locations. It is of fixed size and allows random access using indices.
Types of Arrays:
- 1D Array: A single row of elements.
- 2D Array: A matrix-like structure with rows and columns.
- Dynamic Array: Resizable array (e.g., vector in C++ and ArrayList in Java).
Example:
#include <iostream>
using namespace std;

int main() {
    int arr[5] = {1, 2, 3, 4, 5};
    for (int i = 0; i < 5; i++) {
        cout << arr[i] << " ";
    }
    return 0;
}
2. Linked List
A linked list consists of nodes where each node contains data and a pointer to the next node.
Types of Linked Lists:
- Singly Linked List: Each node points to the next node.
- Doubly Linked List: Each node has pointers to both previous and next nodes.
- Circular Linked List: The last node points back to the first node.
Example:
struct Node {
    int data;     // value stored in this node
    Node* next;   // pointer to the next node (nullptr at the tail)
};
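To make this concrete, here is a small, hedged sketch (the printList name and the three hard-coded nodes are illustrative, not part of any standard API) that builds and traverses a singly linked list:
#include <iostream>
using namespace std;

struct Node {
    int data;
    Node* next;
};

// Traverse the list from head to tail, printing each value.
void printList(Node* head) {
    for (Node* cur = head; cur != nullptr; cur = cur->next)
        cout << cur->data << " ";
    cout << "\n";
}

int main() {
    // Build the list 1 -> 2 -> 3 by hand.
    Node* third  = new Node{3, nullptr};
    Node* second = new Node{2, third};
    Node* head   = new Node{1, second};
    printList(head);   // prints: 1 2 3
    delete head; delete second; delete third;
    return 0;
}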
3. Stacks
A stack follows the LIFO (Last In, First Out) principle. Operations:
- Push: Insert element.
- Pop: Remove top element.
- Top/Peek: Get top element.
Applications:
- Function calls (Recursion)
- Undo/Redo functionality
- Expression evaluation
Example:
#include <stack>
using namespace std;

stack<int> s;
s.push(10);   // Push: 10 is now the top element
s.top();      // Top/Peek: returns 10
s.pop();      // Pop: removes 10
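Since expression handling is listed as an application, the following hedged sketch (the isBalanced name is just for illustration) shows a stack checking whether parentheses are balanced:
#include <stack>
#include <string>
using namespace std;

// Returns true if every '(' has a matching ')' in the right order.
bool isBalanced(const string& expr) {
    stack<char> st;
    for (char c : expr) {
        if (c == '(') st.push(c);           // open bracket: remember it
        else if (c == ')') {
            if (st.empty()) return false;   // closing with nothing open
            st.pop();                       // matches the most recent '('
        }
    }
    return st.empty();                      // no unmatched '(' left
}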
4. Queues
A queue follows the FIFO (First In, First Out) principle. Types:
- Circular Queue
- Deque (Double-Ended Queue)
- Priority Queue
Example:
#include <queue>
using namespace std;

queue<int> q;
q.push(10);   // enqueue at the back
q.pop();      // dequeue from the front
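All of the queue variants listed above exist in the C++ standard library; a brief, hedged usage sketch (the values are arbitrary):
#include <deque>
#include <iostream>
#include <queue>
using namespace std;

int main() {
    deque<int> dq;              // Deque: insertion/removal at both ends
    dq.push_back(1);
    dq.push_front(2);           // dq now holds {2, 1}

    priority_queue<int> pq;     // Priority Queue: largest element on top (max-heap by default)
    pq.push(5);
    pq.push(9);
    cout << pq.top();           // prints 9
    pq.pop();
    return 0;
}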
Non-Linear Data Structures
1. Trees
A tree is a hierarchical data structure.
Binary Tree Traversals:
- Inorder (Left, Root, Right)
- Preorder (Root, Left, Right)
- Postorder (Left, Right, Root)
Example:
struct Node {
    int data;
    Node* left;    // left child
    Node* right;   // right child
};
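A minimal sketch of the inorder traversal described above, using the Node structure just defined:
#include <iostream>
using namespace std;

struct Node {
    int data;
    Node* left;
    Node* right;
};

// Inorder traversal: visit the left subtree, then the root, then the right subtree.
void inorder(Node* root) {
    if (root == nullptr) return;   // empty subtree: nothing to do
    inorder(root->left);
    cout << root->data << " ";
    inorder(root->right);
}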
2. Graphs
A graph is a collection of nodes (vertices) and edges.
Representation:
- Adjacency Matrix
- Adjacency List
Graph Traversal:
- BFS (Breadth-First Search)
- DFS (Depth-First Search)
Example:
#include <vector>
using namespace std;
vector<int> adj[10];     // adjacency list: adj[u] holds the neighbours of u
adj[1].push_back(2);     // add edge 1 -> 2
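A hedged sketch of BFS over this adjacency-list representation (the vertex count V and the start vertex are parameters chosen for illustration):
#include <iostream>
#include <queue>
#include <vector>
using namespace std;

// Breadth-First Search from vertex `start` over an adjacency list with V vertices.
void bfs(int start, int V, vector<int> adj[]) {
    vector<bool> visited(V, false);
    queue<int> q;
    visited[start] = true;
    q.push(start);
    while (!q.empty()) {
        int u = q.front(); q.pop();
        cout << u << " ";             // process u (here: just print it)
        for (int v : adj[u]) {
            if (!visited[v]) {        // enqueue each unvisited neighbour once
                visited[v] = true;
                q.push(v);
            }
        }
    }
}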
3. Hashing
Hashing is a technique to store and retrieve data efficiently using a hash function.
Collision Handling:
- Chaining (Linked List in each bucket)
- Open Addressing (Linear Probing, Quadratic Probing, Double Hashing)
Example:
unordered_map<int, string> hashTable;   // from <unordered_map>; uses a hash of the key
hashTable[1] = "Hello";                 // key 1 maps to "Hello"
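To make the chaining idea concrete, here is a hedged sketch of a tiny hash table that resolves collisions with one list per bucket (the table size of 10 and the modulo hash are arbitrary choices for this example):
#include <iostream>
#include <list>
using namespace std;

const int TABLE_SIZE = 10;          // number of buckets (arbitrary for this sketch)
list<int> buckets[TABLE_SIZE];      // chaining: each bucket is a linked list

int hashKey(int key) {
    return key % TABLE_SIZE;        // simple modulo hash function
}

void insertKey(int key) {
    buckets[hashKey(key)].push_back(key);   // colliding keys share a bucket
}

bool containsKey(int key) {
    for (int k : buckets[hashKey(key)])     // scan only the one bucket
        if (k == key) return true;
    return false;
}

int main() {
    insertKey(15);
    insertKey(25);                  // 15 and 25 both hash to bucket 5 (collision)
    cout << containsKey(25);        // prints 1
    return 0;
}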
Algorithms
Sorting Algorithms
1. Bubble Sort
- Repeatedly swaps adjacent elements if they are in the wrong order.
- Time Complexity: O(n²)
Example:
void bubbleSort(int arr[], int n) {
    for (int i = 0; i < n - 1; i++) {
        for (int j = 0; j < n - i - 1; j++) {
            if (arr[j] > arr[j + 1]) {
                swap(arr[j], arr[j + 1]);   // adjacent elements out of order: swap
            }
        }
    }
}
2. Selection Sort
- Finds the minimum element and places it at the beginning.
- Time Complexity: O(n²)
Example:
void selectionSort(int arr[], int n) {
    for (int i = 0; i < n - 1; i++) {
        int minIdx = i;
        for (int j = i + 1; j < n; j++) {
            if (arr[j] < arr[minIdx]) {
                minIdx = j;             // remember position of the smallest element
            }
        }
        swap(arr[minIdx], arr[i]);      // place it at the front of the unsorted part
    }
}
3. Insertion Sort
- Inserts elements into their correct position.
- Time Complexity: O(n²)
Example:
void insertionSort(int arr[], int n) {
    for (int i = 1; i < n; i++) {
        int key = arr[i];
        int j = i - 1;
        while (j >= 0 && arr[j] > key) {   // shift larger elements one slot right
            arr[j + 1] = arr[j];
            j--;
        }
        arr[j + 1] = key;                  // insert key into its correct position
    }
}
4. Merge Sort
- Uses Divide and Conquer to split the array and merge them in sorted order.
- Time Complexity: O(n log n)
Example:
void merge(int arr[], int l, int m, int r) {
    int n1 = m - l + 1, n2 = r - m;
    vector<int> L(n1), R(n2);   // temporary halves (vectors avoid non-standard variable-length arrays)
    for (int i = 0; i < n1; i++) L[i] = arr[l + i];
    for (int j = 0; j < n2; j++) R[j] = arr[m + 1 + j];
    int i = 0, j = 0, k = l;
    while (i < n1 && j < n2) arr[k++] = (L[i] <= R[j]) ? L[i++] : R[j++];
    while (i < n1) arr[k++] = L[i++];
    while (j < n2) arr[k++] = R[j++];
}

void mergeSort(int arr[], int l, int r) {
    if (l < r) {
        int m = l + (r - l) / 2;
        mergeSort(arr, l, m);
        mergeSort(arr, m + 1, r);
        merge(arr, l, m, r);
    }
}
5. Quick Sort
- Selects a pivot and partitions the array into two halves.
- Time Complexity: O(n log n) (Worst case O(n²))
Example:
int partition(int arr[], int low, int high) {
    int pivot = arr[high];
    int i = low - 1;
    for (int j = low; j < high; j++) {
        if (arr[j] < pivot) {
            swap(arr[++i], arr[j]);     // move smaller elements to the left side
        }
    }
    swap(arr[i + 1], arr[high]);        // place the pivot in its final position
    return i + 1;
}

void quickSort(int arr[], int low, int high) {
    if (low < high) {
        int pi = partition(arr, low, high);
        quickSort(arr, low, pi - 1);
        quickSort(arr, pi + 1, high);
    }
}
6. Heap Sort
- Uses a binary heap to sort elements.
- Time Complexity: O(n log n)
Example:
void heapify(int arr[], int n, int i) {
    int largest = i, left = 2 * i + 1, right = 2 * i + 2;
    if (left < n && arr[left] > arr[largest]) largest = left;
    if (right < n && arr[right] > arr[largest]) largest = right;
    if (largest != i) {
        swap(arr[i], arr[largest]);
        heapify(arr, n, largest);       // restore the heap property below
    }
}

void heapSort(int arr[], int n) {
    for (int i = n / 2 - 1; i >= 0; i--) heapify(arr, n, i);   // build the max-heap
    for (int i = n - 1; i > 0; i--) {
        swap(arr[0], arr[i]);           // move the current maximum to the end
        heapify(arr, i, 0);
    }
}
7. Counting Sort & Radix Sort
- Counting Sort: Counts the occurrences of each key and uses those counts to place elements directly.
- Radix Sort: Sorts numbers digit by digit, using Counting Sort as a stable subroutine.
- Time Complexity: O(n + k) for Counting Sort, where k is the range of key values.
Example:
void countingSort(int arr[], int n, int exp) {   // stable sort by the digit at position exp
    vector<int> output(n);
    int count[10] = {0};
    for (int i = 0; i < n; i++) count[(arr[i] / exp) % 10]++;
    for (int i = 1; i < 10; i++) count[i] += count[i - 1];
    for (int i = n - 1; i >= 0; i--) {
        output[count[(arr[i] / exp) % 10] - 1] = arr[i];
        count[(arr[i] / exp) % 10]--;
    }
    for (int i = 0; i < n; i++) arr[i] = output[i];
}

void radixSort(int arr[], int n) {
    int maxVal = *max_element(arr, arr + n);     // requires <algorithm>
    for (int exp = 1; maxVal / exp > 0; exp *= 10) countingSort(arr, n, exp);
}
Searching Algorithms
1. Linear Search
- Definition: A simple searching algorithm that checks each element of the list sequentially until the desired element is found.
- Time Complexity: O(n)
Example:
int linearSearch(int arr[], int n, int key) {
    for (int i = 0; i < n; i++) {
        if (arr[i] == key) return i;    // found: return its index
    }
    return -1;                          // not found
}
2. Binary Search
- Definition: A searching algorithm that works on sorted arrays by repeatedly dividing the search interval in half.
- Time Complexity: O(log n)
Example:
int binarySearch(int arr[], int left, int right, int key) {
    while (left <= right) {
        int mid = left + (right - left) / 2;   // avoids overflow of (left + right)
        if (arr[mid] == key) return mid;
        if (arr[mid] < key) left = mid + 1;    // key is in the right half
        else right = mid - 1;                  // key is in the left half
    }
    return -1;
}
Divide & Conquer Algorithms
1. Merge Sort
- Definition: A sorting algorithm that divides the array into halves, sorts them, and merges them back.
- Time Complexity: O(n log n)
Example:
void merge(int arr[], int l, int m, int r) {
    // Merging logic (see the full Merge Sort implementation above)
}

void mergeSort(int arr[], int l, int r) {
    if (l < r) {
        int m = l + (r - l) / 2;
        mergeSort(arr, l, m);        // sort the left half
        mergeSort(arr, m + 1, r);    // sort the right half
        merge(arr, l, m, r);         // merge the two sorted halves
    }
}
2. Quick Sort
- Definition: A sorting algorithm that picks a pivot and partitions the array around it.
- Time Complexity: O(n log n) on average; O(n²) in the worst case.
Example:
int partition(int arr[], int low, int high) {
    int pivot = arr[high];
    int i = low - 1;
    for (int j = low; j < high; j++) {
        if (arr[j] < pivot) {
            swap(arr[++i], arr[j]);
        }
    }
    swap(arr[i + 1], arr[high]);
    return i + 1;
}
3. Matrix Multiplication
- Definition: Multiplication of two matrices using divide and conquer (Strassen’s Algorithm).
- Time Complexity: O(n³) for the standard algorithm; roughly O(n^2.81) for Strassen's algorithm.
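Strassen's algorithm itself is fairly involved, so as a baseline here is a hedged sketch of the standard O(n³) multiplication it improves upon (square matrices stored as vectors of vectors are an assumption of this sketch):
#include <vector>
using namespace std;

// Standard O(n^3) multiplication of two n x n matrices.
vector<vector<int>> multiply(const vector<vector<int>>& A,
                             const vector<vector<int>>& B) {
    int n = A.size();
    vector<vector<int>> C(n, vector<int>(n, 0));
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            for (int k = 0; k < n; k++)
                C[i][j] += A[i][k] * B[k][j];   // dot product of row i and column j
    return C;
}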
Dynamic Programming (DP)
1. Fibonacci Series
- Definition: DP technique to avoid recomputation by storing results of subproblems.
- Time Complexity: O(n)
Example:
int fib(int n, vector<int>& dp) {
    if (n <= 1) return n;
    if (dp[n] != -1) return dp[n];      // already computed: reuse the stored result
    return dp[n] = fib(n - 1, dp) + fib(n - 2, dp);
}
2. Knapsack Problem
- Definition: A problem where we maximize the value of items in a knapsack of given capacity.
- Time Complexity: O(nW) with memoization or a DP table, where W is the knapsack capacity; the plain recursion below is exponential.
Example:
int knapsack(int W, int wt[], int val[], int n) {
    if (n == 0 || W == 0) return 0;
    if (wt[n - 1] > W) return knapsack(W, wt, val, n - 1);   // item too heavy: skip it
    return max(val[n - 1] + knapsack(W - wt[n - 1], wt, val, n - 1),   // take item n-1
               knapsack(W, wt, val, n - 1));                           // or leave it
}
3. Longest Common Subsequence (LCS)
- Definition: The longest sequence that appears in the same order in two strings.
- Time Complexity: O(mn) with memoization or a DP table; the plain recursion below is exponential.
Example:
int lcs(string X, string Y, int m, int n) {
    if (m == 0 || n == 0) return 0;
    if (X[m - 1] == Y[n - 1]) return 1 + lcs(X, Y, m - 1, n - 1);   // last characters match
    return max(lcs(X, Y, m, n - 1), lcs(X, Y, m - 1, n));           // drop one character from either string
}
4. Longest Increasing Subsequence (LIS)
- Definition: The longest subsequence whose elements appear in strictly increasing order.
- Time Complexity: O(n log n) with the binary-search approach below (the classic DP table is O(n²)).
Example:
int lis(vector<int>& nums) {
    vector<int> dp;   // dp[i] = smallest possible tail of an increasing subsequence of length i+1
    for (int num : nums) {
        auto it = lower_bound(dp.begin(), dp.end(), num);
        if (it == dp.end()) dp.push_back(num);   // num extends the longest subsequence
        else *it = num;                          // num becomes a smaller tail
    }
    return dp.size();
}
Greedy Algorithms
A Greedy Algorithm makes a series of choices, each of which looks best at the moment, with the hope that this will lead to the optimal solution. It does not always guarantee an optimal solution but works well in many problems.
1. Activity Selection Problem
Problem Statement: Given n activities with their start and finish times, select the maximum number of activities that can be performed by a single person, assuming that a person can only work on one activity at a time.
Approach:
- Sort activities by their finish time.
- Select the first activity.
- For each subsequent activity, select it if its start time is greater than or equal to the finish time of the previously selected activity.
Example:
Activity | Start | Finish |
---|---|---|
A1 | 1 | 3 |
A2 | 2 | 5 |
A3 | 4 | 6 |
A4 | 5 | 7 |
A5 | 8 | 9 |
Selected Activities: A1, A3, A5
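A hedged sketch of this greedy selection in code (the Activity struct and the maxActivities name are choices made for this example):
#include <algorithm>
#include <iostream>
#include <vector>
using namespace std;

struct Activity { int start, finish; };

int maxActivities(vector<Activity> acts) {
    // Sort by finish time so the earliest-finishing activity is considered first.
    sort(acts.begin(), acts.end(),
         [](const Activity& a, const Activity& b) { return a.finish < b.finish; });
    int count = 0, lastFinish = -1;
    for (const Activity& a : acts) {
        if (a.start >= lastFinish) {   // compatible with the previously selected activity
            count++;
            lastFinish = a.finish;
        }
    }
    return count;
}

int main() {
    vector<Activity> acts = {{1, 3}, {2, 5}, {4, 6}, {5, 7}, {8, 9}};
    cout << maxActivities(acts);   // prints 3 (A1, A3, A5)
    return 0;
}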
2. Huffman Coding
Problem Statement: Given a set of characters and their frequencies, construct a binary tree-based encoding scheme that minimizes the total encoded length.
Approach:
- Create a min-heap of all characters based on their frequency.
- Extract the two nodes with the lowest frequency and combine them into a new node with their sum.
- Repeat until only one node remains (root of the Huffman tree).
- Assign 0 to the left edge and 1 to the right edge to get the Huffman code.
Example:
Character | Frequency | Huffman Code |
---|---|---|
A | 5 | 100 |
B | 9 | 101 |
C | 12 | 00 |
D | 13 | 01 |
E | 16 | 11 |
(The code lengths are optimal for these frequencies; the exact 0/1 labels depend on tie-breaking while building the tree.)
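A hedged sketch of building the Huffman tree with a min-heap (the HuffNode representation and the printCodes name are illustrative choices, not a fixed API):
#include <iostream>
#include <queue>
#include <string>
#include <vector>
using namespace std;

struct HuffNode {
    char ch;            // '\0' for internal nodes
    int freq;
    HuffNode *left, *right;
};

struct CompareFreq {    // min-heap ordering on frequency
    bool operator()(HuffNode* a, HuffNode* b) { return a->freq > b->freq; }
};

// Walk the tree, appending '0' for left edges and '1' for right edges.
void printCodes(HuffNode* node, const string& code) {
    if (!node) return;
    if (!node->left && !node->right) {      // leaf: a real character
        cout << node->ch << ": " << code << "\n";
        return;
    }
    printCodes(node->left, code + "0");
    printCodes(node->right, code + "1");
}

int main() {
    vector<pair<char, int>> freqs = {{'A', 5}, {'B', 9}, {'C', 12}, {'D', 13}, {'E', 16}};
    priority_queue<HuffNode*, vector<HuffNode*>, CompareFreq> pq;
    for (auto& p : freqs) pq.push(new HuffNode{p.first, p.second, nullptr, nullptr});
    while (pq.size() > 1) {
        HuffNode* a = pq.top(); pq.pop();   // two lowest-frequency nodes
        HuffNode* b = pq.top(); pq.pop();
        pq.push(new HuffNode{'\0', a->freq + b->freq, a, b});   // merge them
    }
    printCodes(pq.top(), "");
    return 0;
}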
3. Kruskal’s & Prim’s Algorithm (Minimum Spanning Tree - MST)
Kruskal’s Algorithm:
- Sort all edges by weight.
- Pick the smallest edge that does not form a cycle.
- Repeat until V-1 edges are selected (where V is the number of vertices).
Prim’s Algorithm:
- Start from any node.
- Select the smallest edge that connects a new vertex to the MST.
- Repeat until all vertices are included.
Example (Graph Representation):
A---1---B
/ \ |
2 3 4
/ \ |
C---5---D---E
- Kruskal’s MST: Select edges: (A-B), (A-C), (A-D), (D-E)
- Prim’s MST (starting from A): (A-B), (A-C), (A-D), (D-E)
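A hedged sketch of Kruskal's algorithm on an edge list, using a minimal union-find to detect cycles (see the DSU section under Advanced Topics for the full structure):
#include <algorithm>
#include <numeric>
#include <vector>
using namespace std;

struct Edge { int u, v, w; };

// Minimal union-find used to detect cycles while adding edges.
int findRoot(vector<int>& parent, int x) {
    while (parent[x] != x) x = parent[x] = parent[parent[x]];   // path halving
    return x;
}

// Returns the total weight of a minimum spanning tree of a connected graph.
int kruskalMST(int V, vector<Edge> edges) {
    sort(edges.begin(), edges.end(),
         [](const Edge& a, const Edge& b) { return a.w < b.w; });   // cheapest edges first
    vector<int> parent(V);
    iota(parent.begin(), parent.end(), 0);
    int total = 0, used = 0;
    for (const Edge& e : edges) {
        int ru = findRoot(parent, e.u), rv = findRoot(parent, e.v);
        if (ru != rv) {                   // edge joins two different components: no cycle
            parent[ru] = rv;
            total += e.w;
            if (++used == V - 1) break;   // MST complete after V-1 edges
        }
    }
    return total;
}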
Backtracking Algorithms
Backtracking is a recursive approach to solving problems by trying all possible solutions and undoing (backtracking) when an invalid solution is encountered.
1. N-Queens Problem
Problem Statement: Place N queens on an N×N chessboard so that no two queens attack each other.
Approach:
- Place queens one by one in different columns.
- If a queen can be placed safely, proceed to the next column.
- If no safe placement is found, backtrack and try a different position.
Example (4-Queens Solution):
. Q . .
. . . Q
Q . . .
. . Q .
2. Sudoku Solver
Problem Statement: Given a 9×9 Sudoku board, fill in the empty cells while following the rules:
- Each row, column, and 3×3 subgrid must contain digits 1-9 without repetition.
Approach:
- Find an empty cell.
- Try placing numbers 1-9.
- If a number is valid, move to the next empty cell.
- If no number works, backtrack.
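A hedged sketch of this backtracking solver (a 9×9 board with 0 marking empty cells, and the isValid/solveSudoku names, are assumptions of this example):
#include <vector>
using namespace std;

// Returns true if digit d can be placed at (row, col) without breaking Sudoku rules.
bool isValid(const vector<vector<int>>& board, int row, int col, int d) {
    for (int i = 0; i < 9; i++) {
        if (board[row][i] == d || board[i][col] == d) return false;   // row / column clash
        int r = 3 * (row / 3) + i / 3, c = 3 * (col / 3) + i % 3;     // cell i of the 3x3 subgrid
        if (board[r][c] == d) return false;
    }
    return true;
}

// Fills empty cells (marked 0) by backtracking; returns true if a full solution exists.
bool solveSudoku(vector<vector<int>>& board) {
    for (int row = 0; row < 9; row++) {
        for (int col = 0; col < 9; col++) {
            if (board[row][col] != 0) continue;      // already filled
            for (int d = 1; d <= 9; d++) {
                if (isValid(board, row, col, d)) {
                    board[row][col] = d;
                    if (solveSudoku(board)) return true;
                    board[row][col] = 0;             // backtrack
                }
            }
            return false;                            // no digit fits: backtrack further
        }
    }
    return true;                                     // no empty cell left
}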
3. Rat in a Maze
Problem Statement: Given an N×N maze of open cells (1) and walls (0), find a path from the top-left cell to the bottom-right cell using only right and down moves.
Approach:
- Start at (0,0), move in possible directions.
- If the destination is reached, print the path.
- If a move is invalid, backtrack and try a different path.
Example:
1 0 0 0
1 1 0 1
0 1 0 0
1 1 1 1
Valid Path: Down → Right → Down → Down → Right → Right
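A hedged sketch of the backtracking search for this example (right/down-only movement and the findPath name follow the problem statement above):
#include <vector>
using namespace std;

// Search for a path from (r, c) to the bottom-right cell, moving only right or down.
// `path` collects 'R' / 'D' moves and is unwound on backtracking.
bool findPath(const vector<vector<int>>& maze, int r, int c, vector<char>& path) {
    int n = maze.size();
    if (r >= n || c >= n || maze[r][c] == 0) return false;   // off the grid or a wall
    if (r == n - 1 && c == n - 1) return true;               // reached the destination
    path.push_back('D');
    if (findPath(maze, r + 1, c, path)) return true;         // try moving down
    path.back() = 'R';
    if (findPath(maze, r, c + 1, path)) return true;         // try moving right
    path.pop_back();                                         // backtrack: neither move worked
    return false;
}

int main() {
    vector<vector<int>> maze = {{1, 0, 0, 0},
                                {1, 1, 0, 1},
                                {0, 1, 0, 0},
                                {1, 1, 1, 1}};
    vector<char> path;
    findPath(maze, 0, 0, path);   // path becomes D R D D R R for this maze
    return 0;
}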
Advanced Topics
Competitive Programming Topics
1. Sliding Window Technique
- Used for problems involving contiguous subarrays.
- Example: Finding the maximum sum of a subarray of size k.
Example (Max Sum of k-sized Subarray)
int maxSum(int arr[], int n, int k) {
    int sum = 0;
    for (int i = 0; i < k; i++) sum += arr[i];   // sum of the first window
    int maxSum = sum;                            // start from the first window, not 0
    for (int i = k; i < n; i++) {
        sum += arr[i] - arr[i - k];              // slide the window by one element
        maxSum = max(maxSum, sum);
    }
    return maxSum;
}
2. Two Pointers Technique
- Efficient for searching pairs in sorted arrays.
- Example: Finding if two numbers sum up to X.
Example (Pair Sum Problem)
bool hasPair(int arr[], int n, int x) {
int l = 0, r = n - 1;
while (l < r) {
int sum = arr[l] + arr[r];
if (sum == x) return true;
(sum < x) ? l++ : r--;
}
return false;
}
3. Disjoint Set Union (DSU)
- Efficiently unites and finds sets.
- Used in Kruskal’s algorithm and cycle detection.
Example (DSU Implementation with Path Compression & Union by Rank)
const int N = 100005;            // maximum number of elements (choose as needed)
int parent[N], rnk[N];           // "rnk" avoids clashing with std::rank
void makeSet(int v) { parent[v] = v; rnk[v] = 0; }
int find(int v) { return (parent[v] == v) ? v : (parent[v] = find(parent[v])); }   // path compression
void unionSets(int a, int b) {
    a = find(a); b = find(b);
    if (a != b) {
        if (rnk[a] < rnk[b]) swap(a, b);   // union by rank: attach the shorter tree
        parent[b] = a;
        if (rnk[a] == rnk[b]) rnk[a]++;
    }
}
4. Topological Sorting
- Used in DAGs (Directed Acyclic Graphs) for dependency resolution.
- Implemented using DFS or Kahn’s Algorithm (BFS).
Example (Topological Sorting using Kahn’s Algorithm - BFS)
vector<int> topoSort(int V, vector<int> adj[]) {
vector<int> inDegree(V, 0), res;
queue<int> q;
for (int i = 0; i < V; i++)
for (int v : adj[i]) inDegree[v]++;
for (int i = 0; i < V; i++)
if (inDegree[i] == 0) q.push(i);
while (!q.empty()) {
int u = q.front(); q.pop();
res.push_back(u);
for (int v : adj[u])
if (--inDegree[v] == 0) q.push(v);
}
return res;
}