Strings

Strings are one of the most fundamental data types in computer science.
They serve as the foundation for text processing, pattern recognition, and encoding/decoding operations in almost every system — from compilers to web applications, databases, and machine learning pipelines.

What is a string?

A string is a sequence of characters, often represented as an array of char or a sequence of Unicode code points.
Strings can store letters, digits, symbols, and whitespace, and are used to represent both human-readable text and machine-readable data (like JSON or XML).

Immutability vs Mutability

In most modern programming languages, strings are immutable, meaning that any operation that appears to modify a string actually creates a new string in memory. This has important implications for both time and space complexity when manipulating strings.

let s = "hello";
s += " world"; // Creates a new string, old string remains unchanged

Immutability matters for: • safety, because immutable strings are thread-safe and easier to reason about in concurrent environments. • predictability, because functions that operate on strings cannot inadvertently modify inputs elsewhere in the code. • performance considerations, because every modification creates a new string, which may lead to increased memory allocations and slower concatenation for large strings. Efficient string handling techniques, like using StringBuilder in Java or array joins in JavaScript, are often necessary.

Internal representation

Internally, strings are often stored as contiguous arrays of characters or bytes, sometimes with additional metadata like length and encoding. Operations like concatenation, slicing, and replacement may require copying these arrays or adjusting indices depending on mutability.

Operations and Time Complexity

The time complexity of insertion and deletion operations on strings depends heavily on where the operation occurs and whether the string is immutable or mutable:

Insert/Delete at the end of the string:

Immutable strings (e.g., JavaScript String, Python str, Java String) require creating a new string with the added/removed character, often resulting in O(n) time.
Mutable strings (e.g., C char[], Java StringBuilder, Swift String) can append or remove characters at the end in O(1) time if there is pre-allocated buffer space.
In languages with optimized concatenation (like Python’s str.join), sequences of appends can behave almost like O(1) amortized per operation.

Insert/Delete in the middle of the string:

Regardless of immutability, you need to shift all subsequent characters to accommodate the change.
This means the operation always costs O(n) in time, where n is the length of the string.

Below you can find a table that summarize all the info above and show the time complexity of common string operations:

Operation	Average Case	Worst Case
Access (indexing)	O(1)	O(1)
Search	O(n)	O(n)
Concatenation	O(n + m)	O(n + m)
Substring	O(k)	O(k)
Comparison	O(n)	O(n)
Insert (end)	O(1)	O(n)
Insert (middle)	O(n)	O(n)
Delete (end)	O(1)	O(n)
Delete (middle)	O(n)	O(n)
Reverse	O(n)	O(n)

Techniques

In this section, we’ll explore some of the most common algorithmic techniques applied to string problems.
Many of these are also applicable to arrays, but strings introduce additional challenges due to immutability, character encoding, and pattern-based logic.

Frequency Counting

One of the most versatile tools in string manipulation is frequency counting, a technique that maps each character to its number of occurrences.
It’s commonly used in problems involving:

Anagram detection
Palindrome checks
Text histograms or feature extraction in natural language processing (NLP) tasks

Example in TypeScript:

function buildFrequencyMap(s: string): Record<string, number> {
  const freq: Record<string, number> = {};
  for (const char of s) {
    freq[char] = (freq[char] ?? 0) + 1;
  }
  return freq;
}

This algorithm runs in $O(n)$ time and uses $O(1)$ auxiliary space, assuming a fixed-size alphabet (e.g., ASCII). For Unicode strings, space complexity becomes $O(k)$ , where $k$ is the number of unique characters.

Pattern Matching

Another cornerstone of string algorithms is pattern matching: finding one or more substrings inside a larger text.

The brute-force (naive) method checks each position and compares character by character. The time complexity is $O(n × m)$ , where n is the text length and m is the pattern length. Simple but inefficient for large inputs. There are also optimized algorithms for pattern matching. You can find some of the most popular ones below.

Knuth–Morris–Pratt (KMP)

Builds a prefix table (LPS array) to skip redundant comparisons.
Time complexity: $O(n + m)$
Excellent for deterministic searches.

Rabin–Karp

Uses rolling hash to compare substring hashes instead of characters.
Average: $O(n + m)$
Worst case: $O(n × m)$ (hash collisions)

Boyer–Moore

Applies heuristics (“bad character” and “good suffix”) to skip ahead.
Average: $O(n / m)$ — very efficient on natural language text.

Exercises

Exercise	Technique	Solution
Is Subsequence	Two Pointers	Solution
Valid Palindrome	Two Pointers / Filtering	Solution
Longest Common Prefix	Character Comparison / Prefix Scan	Solution
Zigzag Conversion	Simulation / Index Mapping	Solution
Reverse Words in a String	Splitting / String Manipulation	Solution
Guess the Word	Minimax / Constraint Filtering	Solution