🔗 Longest Common Subsequence (LCS) – Complete In-Depth Explanation
The **Longest Common Subsequence (LCS)** problem is a fundamental **dynamic programming** problem that finds the **longest subsequence** present in **two given strings** in the **same order** but **not necessarily contiguous**.
📌 Why is LCS Important?
- ✅ **Used in DNA sequence alignment & bioinformatics.**
- 🚀 **Essential in diff tools (e.g., Git diff, text comparison).**
- 🔍 **Helps in spell checkers & error correction.**
- 📡 **Used in data compression algorithms.**
📌 Understanding LCS with an Example
Given two strings:
X = "ACDBE"
Y = "ABCDE"
The **Longest Common Subsequence** is **"ACDE"** (length = 4).
🔹 **Key Points**
- ✅ LCS is **not contiguous**, unlike substrings.
- 📉 LCS is **not unique**; multiple valid solutions may exist.
🛠 Solution Approaches
📌 **1. Recursion (Brute Force) – O(2^N) Complexity**
We compare characters from both strings:
- ✅ If characters match, **include** them in the LCS.
- 🔍 Otherwise, explore **both possibilities** (excluding either character).
int LCS(string X, string Y, int i, int j) {
if (i == 0 || j == 0) return 0;
if (X[i - 1] == Y[j - 1])
return 1 + LCS(X, Y, i - 1, j - 1);
else
return max(LCS(X, Y, i - 1, j), LCS(X, Y, i, j - 1));
}
🔴 **Disadvantage:** This solution is **exponential (O(2^N))** and infeasible for large inputs.
📌 **2. Dynamic Programming (O(N × M) Complexity)**
Instead of recomputing results, we store them in a **2D table (memoization)** to avoid redundant calculations.
🔹 **Bottom-Up DP Approach (Tabulation Method)**
int LCS_DP(string X, string Y) {
int N = X.length(), M = Y.length();
vector> dp(N + 1, vector(M + 1, 0));
for (int i = 1; i <= N; i++) {
for (int j = 1; j <= M; j++) {
if (X[i - 1] == Y[j - 1])
dp[i][j] = 1 + dp[i - 1][j - 1];
else
dp[i][j] = max(dp[i - 1][j], dp[i][j - 1]);
}
}
return dp[N][M];
}
📌 **3. Space Optimized DP (O(M) Complexity)**
Instead of using a **2D array**, we use a **1D array** and update values row-wise.
int LCS_Optimized(string X, string Y) {
int N = X.length(), M = Y.length();
vector prev(M + 1, 0), curr(M + 1, 0);
for (int i = 1; i <= N; i++) {
for (int j = 1; j <= M; j++) {
if (X[i - 1] == Y[j - 1])
curr[j] = 1 + prev[j - 1];
else
curr[j] = max(prev[j], curr[j - 1]);
}
prev = curr;
}
return prev[M];
}
📌 Printing the LCS Sequence
Using **backtracking**, we reconstruct the LCS from the DP table.
string printLCS(string X, string Y) {
int N = X.length(), M = Y.length();
vector> dp(N + 1, vector(M + 1, 0));
for (int i = 1; i <= N; i++) {
for (int j = 1; j <= M; j++) {
if (X[i - 1] == Y[j - 1])
dp[i][j] = 1 + dp[i - 1][j - 1];
else
dp[i][j] = max(dp[i - 1][j], dp[i][j - 1]);
}
}
int i = N, j = M;
string LCS_String = "";
while (i > 0 && j > 0) {
if (X[i - 1] == Y[j - 1]) {
LCS_String += X[i - 1];
i--; j--;
} else if (dp[i - 1][j] > dp[i][j - 1]) {
i--;
} else {
j--;
}
}
reverse(LCS_String.begin(), LCS_String.end());
return LCS_String;
}
⏳ Time Complexity Analysis
Algorithm | Time Complexity |
---|---|
Recursive (Brute Force) | O(2^N) |
Dynamic Programming (2D Table) | O(N × M) |
Space Optimized DP (1D Table) | O(N × M) with O(M) space |
🌍 Real-World Applications of LCS
- 📡 **Bioinformatics** – DNA sequence alignment.
- 📝 **Text Comparison** – Finding differences in documents.
- 🔍 **Data Compression** – Detecting repeated sequences.
- 💡 **Error Detection & Correction** – Spell checkers & auto-correct.