[PS] 자료구조와 알고리즘 개념 정리 (Data Structure) Part 1

내용을 보강하여 [PS] 알고리즘, 자료구조, 풀이팁, 문제리스트 포스트에 업로드하였습니다.

NOMENCLATURE

$V$ : 노드(Node, Vertex)의 개수
$E$ : 간선(Edge)의 개수
$n$ : 원소의 개수 (또는 $N$)
$\log n$ : 밑이 2인 $\log_2 n$을 간략하게 표기
노드 = 정점
간선 = 엣지

본 포스팅은 필자가 Problem Solving(PS)를 공부하면서 정리한 포스팅이다.

Data structure

Array

배열(Array)는 메모리 상에 원소를 연속하게 배치한 자료구조를 말한다. C++에서는 배열의 원소가 한 번 정해지면 ( int A[10] ) 이후 크기를 변경하는 것이 일반적으로 불가능하다.

C++ Skeleton Code (Pure) #1

#include <cstdio>

int A[10] = {10, 20, 30};
int len = 3;
  
void insert(int idx, int num) {
    for(int i=len; i>idx; i--) A[i] = A[i-1];
    A[idx] = num;
    len++;
}

void erase(int idx) {
    len--;
    for(int i=idx; i<len; i++) A[i] = A[i+1];
}

void Print() { 
    for(int i=0; i<len; i++) printf("%d ", A[i]); 
    puts(""); 
}


int main() {
  printf("insert test\n");
  insert(3, 40); Print(); // 10 20 30 40
  insert(1, 50); Print(); // 10 50 20 30 40
  insert(0, 15); Print(); // 15 10 50 20 30 40
  
  printf("erase test\n");
  erase(4); Print(); // 15 10 50 20 40 
  erase(1); Print(); // 15 50 20 40 
  erase(3); Print(); // 15 50 20 
}

동적 배열이 필요한 경우 매 번 길이가 capacity에 도달할 때마다 배열의 길이를 2배씩 증가시켜주면 삽입 연산의 시간복잡도가 Amortized $O(1)$이 된다. (사실 상 $O(1)$과 동일한 시간복잡도) (BaaaaaaaarkingDog님 동적 배열 참조)
C++ Skeleton Code (Pure) #2

#include <bits/stdc++.h>
using namespace std;

int* A;
int len = 0;       // 실제 데이터가 들어있는 크기
int capacity = 0;  // 삽입 가능한 최대 크기

void init(){
  A = new int[1];
  capacity = 1;
}

void expand(){
  int* tmp = new int[2*capacity];
  for(int i=0;i<len;i++)
    tmp[i] = A[i];
  delete[] A;
  A = tmp;
  capacity = 2*capacity;
}

void insert(int idx, int num) {
  if(len == capacity) expand();

  for(int i=len; i>idx; i--) A[i] = A[i-1];
  A[idx] = num;
  len++;
}

void erase(int idx) {
  len--;
  for(int i=idx; i<len; i++) A[i] = A[i+1];
}

void Print() {
  for(int i=0; i<len; i++) printf("%d ", A[i]);
  puts("");
}

int main() {
  init();

  printf("insert test\n");
  insert(0, 10); Print(); // 10, len = 1, capacity = 1
  insert(0, 30); Print(); // 30 10, len = 2, capacity = 2
  insert(1, 20); Print(); // 30 20 10, len = 3, capacity = 4
  insert(3, 40); Print(); // 30 20 10 40, len = 4, capacity = 4
  insert(1, 50); Print(); // 30 50 20 10 40, len = 5, capacity = 8
  insert(0, 15); Print(); // 15 30 50 20 10 40, len = 6, capacity = 8
}

std::vector

STL의 벡터(vector) 컨테이너를 사용하면 배열과 유사하지만 크기를 가변적으로 사용할 수 있다. 벡터는 가변적인 배열의 역할을 수행하기 때문에 배열을 사용하는 거의 모든 C++ 코드에서 광범위하게 사용된다.

operator= : 깊은 복사(deep copy)가 발생한다.
시간복잡도
- push_back : 원소를 끝에 추가 $O(1)$
- pop_back : 마지막 원소 제거 $O(1)$
- [], at : 임의의 위치에 원소 확인/변경: $O(1)$
- insert, erase : 임의의 위치에 원소 추가/제거 $O(n)$

C++ Skeleton Code (w/ STL)

#include <cstdio>
#include <vector>
using namespace std;

vector<int> A;
  
void Print() { 
    for(int i=0; i<A.size(); i++) printf("%d ", A[i]); 
    puts(""); 
}

int main() {
  A.push_back(10), A.push_back(20), A.push_back(30);
  
  printf("insert test\n");
  A.insert(A.begin()+3, 40); Print(); // 10 20 30 40
  A.insert(A.begin()+1, 50); Print(); // 10 50 20 30 40
  A.insert(A.begin()+0, 15); Print(); // 15 10 50 20 30 40
  
  printf("erase test\n");
  A.erase(A.begin()+4); Print(); // 15 10 50 20 40 
  A.erase(A.begin()+1); Print(); // 15 50 20 40 
  A.erase(A.begin()+3); Print(); // 15 50 20 
}

Linked list

연결 리스트(Linked List)는 노드들이 포인터로 연결되어 있는 선형 자료구조이다. 각 노드는 데이터와 다음 노드를 가리키는 포인터로 구성되어 있으며, 데이터의 동적 추가 및 삭제가 용이하다. 연결하는 방법 및 개수에 따라 단일 연결 리스트(single linked list), 이중 연결 리스트(double linked list), 원형 연결 리스트(circular linked list)로 구분할 수 있다.

배열과 달리 임의의 원소로 가기 위해서는 첫번째 원소부터 순서대로 방문해야 한다.
배열과 달리 메모리 상의 배치가 불연속적이다.
다음 원소의 주소값을 가지고 있어야 하므로 64비트(=8바이트) 컴퓨터 기준 $8N$ 바이트만큼 추가적인 메모리가 필요하다. (배열 대비 오버헤드가 있음)
C++ Skeleton Code (Pure)

#include <bits/stdc++.h>
using namespace std;

const int LM = 1000005;
int dat[LM], pre[LM], nxt[LM];
int unused = 1;

void insert(int addr, int num) {
  dat[unused] = num;
  pre[unused] = addr;
  nxt[unused] = nxt[addr];
  if(nxt[addr] != -1) pre[nxt[addr]] = unused;
  nxt[addr] = unused;
  unused++;
}

void erase(int addr){
  nxt[pre[addr]] = nxt[addr];
  if(nxt[addr] != -1) pre[nxt[addr]] = pre[addr];
}

void traverse(){
  int cur = nxt[0];
  while(cur != -1){
    printf("%d ", dat[cur]);
    cur = nxt[cur];
  }
  puts("");
}

int main(int argc, char **argv){
  fill(pre, pre+LM, -1);
  fill(nxt, nxt+LM, -1);

  printf("****** insert_test *****\n");
  insert(0, 10); // 10(address=1)
  traverse();
  insert(0, 30); // 30(address=2) 10
  traverse();
  insert(2, 40); // 30 40(address=3) 10
  traverse();
  insert(1, 20); // 30 40 10 20(address=4)
  traverse();
  insert(4, 70); // 30 40 10 20 70(address=5)
  traverse();

  printf("****** erase_test *****\n");
  erase(1); // 30 40 20 70
  traverse();
  erase(2); // 40 20 70
  traverse();
  erase(4); // 40 70
  traverse();
  erase(5); // 40
  traverse();
}

std::list

STL의 list 컨테이너는 이중 연결 리스트(double linked list)이다.

시간복잡도
- $k$번째 원소를 확인/변경 $O(k)$
- 임의의 위치에 원소를 추가/제거 $O(1)$

C++ Skeleton Code (w/ STL)

#include <cstdio>
#include <list>
using namespace std;

list<int> L;

void traverse() {
    for(auto l : L) printf("%d ", l);
    puts("");
}

int main() {    
    L = {1, 2};          // 1 2
    auto t = L.begin();  // t는 1을 가리킴
    
    L.push_front(10); traverse(); // 10 1 2
    L.push_back(5); traverse();   // 10 1 2 5
    L.insert(t, 6); traverse();   // 10 6 1 2 5
    t = L.erase(t);               // 1 제거, t가 가리키는 값은 2
    printf("%d", *t);             // 2
}

Stack

스택(Stack)은 마지막에 삽입된 요소가 가장 먼저 삭제되는 LIFO(Last In, First Out) 방식의 선형 자료구조이다. 이는 함수의 호출 스택 처리, 수식의 괄호 검사 등에 활용된다. 스택은 제일 상단이 아닌 나머지 원소들의 확인과 변경이 원칙적으로 불가능한 특징이 있다.

시간복잡도
- 원소 추가/제거: $O(1)$
- 제일 상단의 원소 확인: $O(1)$
C++ Skeleton Code (Pure)

#include <cstdio>
constexpr int LM = 1000005;

struct Stack{ 
    int stack[LM];
    int pos=0;
    
    void push(int x) { stack[pos++] = x; }
    void pop() { pos--; }
    bool empty() { return (pos==0); }
    int top() { return stack[pos-1]; }

} S;

int main() {
    S.push(5); S.push(4); S.push(3);
    printf("%d\n", S.top());  // 3
    S.pop(); S.pop();
    printf("%d\n", S.top());  // 5
    S.push(10); S.push(12); 
    printf("%d\n", S.top());  // 12
    S.pop();
    printf("%d\n", S.top());  // 10
}

C++ Skeleton Code (w/ STL)

#include <cstdio>
#include <stack>
using namespace std;

stack<int> S;

int main() {
    S.push(5); S.push(4); S.push(3);
    printf("%d\n", S.top());  // 3
    S.pop(); S.pop();
    printf("%d\n", S.top());  // 5
    S.push(10); S.push(12); 
    printf("%d\n", S.top());  // 12
    S.pop();
    printf("%d\n", S.top());  // 10
}

Queue

큐(Queue)는 처음 삽입된 요소가 가장 먼저 삭제되는 FIFO(First In, First Out) 방식의 선형 자료구조이다. 운영체제의 작업 스케줄링, 네트워크 요청 처리 등에 사용된다. 큐는 제일 앞/뒤가 아닌 나머지 원소들의 확인과 변경이 원칙적으로 불가능한 특징이 있다.

시간복잡도
- 원소 추가/제거: $O(1)$
- 제일 앞/뒤의 원소 확인: $O(1)$
C++ Skeleton Code (Pure)

#include <cstdio>
constexpr int LM = 1000005;

struct Queue{
    int fr=0, re=0;
    int que[LM];

    void push(int x) { que[re++] = x; }
    void pop() { fr++; }
    int front() { return que[fr]; }
    int back() { return que[re-1]; }
} Q;

int main() {
    Q.push(10); Q.push(20); Q.push(30);
    printf("%d %d\n", Q.front(), Q.back());  // 10 30
    Q.pop(); Q.pop();
    Q.push(15); Q.push(25);
    printf("%d %d\n", Q.front(), Q.back());  // 30 25
}

C++ Skeleton Code (w/ STL)

#include <cstdio>
#include <queue>
using namespace std;

queue<int> Q;

int main() {
    Q.push(10); Q.push(20); Q.push(30);
    printf("%d %d\n", Q.front(), Q.back());  // 10 30
    Q.pop(); Q.pop();
    Q.push(15); Q.push(25);
    printf("%d %d\n", Q.front(), Q.back());  // 30 25
}

Deque

덱(Deque)은 양쪽 끝에서 삽입과 삭제가 가능한 자료구조로, 스택과 큐의 특성을 모두 갖는다. 더 유연한 데이터 처리가 가능해 다양한 알고리즘 구현에 유용하다.

시간복잡도
- 원소 추가/제거: $O(1)$
- 제일 앞/뒤의 원소 확인: $O(1)$
제일 앞/뒤가 아닌 나머지 원소들의 확인/변경은 원칙적으로 불가능 (하지만 STL deque는 가능)
C++ Skeleton Code (Pure)

#include <cstdio>
constexpr int LM = 1000005;

int deque[2*LM+1];
int fr=LM, re=LM;

void push_front(int x){ deque[--fr] = x; }

void push_back(int x){ deque[re++] = x; }

void pop_front(){ fr++; }

void pop_back(){ re--; }

int front(){ return deque[fr]; }

int back(){ return deque[re-1]; }

int main() {    
     push_back(30);  // [30]
     printf("%d ", front()); printf("%d\n", back());  // 30 30
     push_front(25); // [25 30]
     push_back(12); printf("%d\n", back());  // 12
     push_back(62); // [25 30 12 62]
     pop_front();   // [30 12 62]
     printf("%d\n", front());  // 30
     pop_front();   // [12 62]
     printf("%d\n", back());   // 62
}

C++ Skeleton Code (w/ STL)

#include <cstdio>
#include <deque>
using namespace std;

deque<int> DQ;

int main() {    
     DQ.push_back(30);  // [30]
     printf("%d ", DQ.front()); printf("%d\n", DQ.back());  // 30 30
     DQ.push_front(25); // [25 30]
     DQ.push_back(12); printf("%d\n", DQ.back());  // 12
     DQ.push_back(62); // [25 30 12 62]
     DQ.pop_front();   // [30 12 62]
     printf("%d\n", DQ.front());  // 30
     DQ.pop_front();   // [12 62]
     printf("%d\n", DQ.back());   // 62
}

Heap

힙(Heap)은 완전 이진 트리를 기반으로 하는 자료구조로, 각 노드의 값이 자식 노드보다 크거나 같은(최대 힙) 또는 작거나 같은(최소 힙)을 만족한다. 우선순위 큐의 구현에 주로 사용된다.

C++ Skeleton Code (Pure) (Max heap)

#include <cstdio>

constexpr int NUM = 10;
constexpr int LM = 105;

int A[] = { 5,2,7,1,3,8,4,6,10,9 };

int heap[LM];
int hn = 0;

void swap(int &a, int &b) {
  int t = a;
  a = b;
  b = t;
}

void push(int nd) {
  heap[++hn] = nd;

  for (int c = hn; c > 1; c /= 2) {
    if (heap[c] > heap[c / 2]) {
      swap(heap[c], heap[c / 2]);
    } 
    else {
      break;
    }
  }
}

void pop() {
  swap(heap[1], heap[hn--]);

  for (int c = 2; c <= hn; c *= 2) {
    if (c < hn && heap[c + 1] > heap[c]) {
      c++;
    }
    if (heap[c] > heap[c / 2]) {
      swap(heap[c], heap[c / 2]);
    }
    else {
      break;
    }
  }
}

int main() {
  for(int i = 0; i < NUM; i++)  printf("%d ", A[i]);   // 5 2 7 1 3 8 4 6 10 9
  printf("\n");
  
  // max heap push
  for(int i = 0; i < NUM; i++) { push(A[i]); }

  for(int i = 1; i <= hn; i++) { printf("%d ", heap[i]);  } // 10 9 7 6 8 5 4 1 3 2
  printf("\n");

  // max heap pop
  while (hn > 1) {   pop();   }

  for(int i = 1; i <= NUM; i++) printf("%d ", heap[i]); // 1 2 3 4 5 6 7 8 9 10
  printf("\n");
}

C++ Skeleton Code (Pure) (Min heap)

#include <cstdio>

constexpr int NUM = 10;
constexpr int LM = 105;

int A[] = { 5,2,7,1,3,8,4,6,10,9 };

int heap[LM];
int hn = 0;

void swap(int &a, int &b) {
  int t = a;
  a = b;
  b = t;
}

void push(int nd) {
  heap[++hn] = nd;

  for (int c = hn; c > 1; c /= 2) {
    if (heap[c] < heap[c / 2]) { // 변경: 최소 힙 조건
      swap(heap[c], heap[c / 2]);
    }
    else {
      break;
    }
  }
}

void pop() {
  swap(heap[1], heap[hn--]);

  for (int c = 2; c <= hn; c *= 2) {
    if (c < hn && heap[c + 1] < heap[c]) { // 변경: 최소 힙 조건
      c++;
    }
    if (heap[c] < heap[c / 2]) { // 변경: 최소 힙 조건
      swap(heap[c], heap[c / 2]);
    }
    else {
      break;
    }
  }
}

int main() {
  for(int i = 0; i < NUM; i++)  printf("%d ", A[i]);   // 5 2 7 1 3 8 4 6 10 9
  printf("\n");

  // min heap push
  for(int i = 0; i < NUM; i++) { push(A[i]); }

  for(int i = 1; i <= hn; i++) { printf("%d ", heap[i]);  } // 최소 힙 구조 확인
  printf("\n");

  // min heap pop
  while (hn > 1) {   pop();   }

  for(int i = 1; i <= NUM; i++) printf("%d ", heap[i]); // 10 9 8 7 6 5 4 3 2 1
  printf("\n");
}

Priority queue

우선순위 큐(Priority Queue)는 원소들이 우선순위에 따라 정렬되며, 우선순위가 가장 높은 원소가 가장 먼저 삭제된다. 힙(Heap)을 사용하여 구현할 수 있으며, 다익스트라 같은 그래프 알고리즘에서 중요하게 사용된다.

C++ Skeleton Code (Pure) (Max heap)

#include <cstdio>

#define LM 1000

class PriorityQueue {
  private:
    int heap[LM];
    int hn;

    void swap(int &a, int &b) {
      int t = a; a = b; b = t;
    }

  public:
    PriorityQueue() : hn(0) {}

    void push(int nd) {
      heap[++hn] = nd;
      for (int c = hn; c > 1; c /= 2) {
        if (heap[c] > heap[c / 2]) {
          swap(heap[c], heap[c / 2]);
        } else {
          break;
        }
      }
    }

    void pop() {
      if (hn == 0) return;
      swap(heap[1], heap[hn--]);
      for (int c = 2; c <= hn; c *= 2) {
        if (c < hn && heap[c + 1] > heap[c]) {
          c++;
        }
        if (heap[c] > heap[c / 2]) {
          swap(heap[c], heap[c / 2]);
        } else {
          break;
        }
      }
    }

    int top() {
      return hn > 0 ? heap[1] : -1;
    }

    bool empty() {
      return hn == 0;
    }
};

int main() {
  // Max heap
  PriorityQueue pq;
  pq.push(5);   pq.push(2);   pq.push(7);  pq.push(1);   pq.push(3);   
  pq.push(8);   pq.push(4);   pq.push(6);   pq.push(10); pq.push(9);  // 5 2 7 1 3 8 4 6 10 9

  printf("%d ", pq.top());            // 10
  pq.pop();  printf("%d ", pq.top()); // 9
  pq.pop();  printf("%d ", pq.top()); // 8
  pq.pop();  printf("%d ", pq.top()); // 7
  pq.pop();  printf("%d ", pq.top()); // 6
  pq.pop();  printf("%d ", pq.top()); // 5
  pq.pop();  printf("%d ", pq.top()); // 4
  pq.pop();  printf("%d ", pq.top()); // 3
  pq.pop();  printf("%d ", pq.top()); // 2
  pq.pop();  printf("%d ", pq.top()); // 1
  printf("\n");
}

C++ Skeleton Code (w/ STL) (Max heap)

#include <cstdio>
#include <queue>

#define LM 1000

int main() {
  // Max heap
  std::priority_queue<int> pq;
  pq.push(5);   pq.push(2);   pq.push(7);  pq.push(1);   pq.push(3);   
  pq.push(8);   pq.push(4);   pq.push(6);   pq.push(10); pq.push(9);  // 5 2 7 1 3 8 4 6 10 9

  printf("%d ", pq.top());            // 10
  pq.pop();  printf("%d ", pq.top()); // 9
  pq.pop();  printf("%d ", pq.top()); // 8
  pq.pop();  printf("%d ", pq.top()); // 7
  pq.pop();  printf("%d ", pq.top()); // 6
  pq.pop();  printf("%d ", pq.top()); // 5
  pq.pop();  printf("%d ", pq.top()); // 4
  pq.pop();  printf("%d ", pq.top()); // 3
  pq.pop();  printf("%d ", pq.top()); // 2
  pq.pop();  printf("%d ", pq.top()); // 1
  printf("\n");
}

C++ Skeleton Code (w/ STL) (Min heap)

#include <cstdio>
#include <vector>
#include <queue>

#define LM 1000

int main() {
  // Min heap
  std::priority_queue<int, std::vector<int>, std::greater<int>> pq;
  pq.push(5);   pq.push(2);   pq.push(7);  pq.push(1);   pq.push(3);   
  pq.push(8);   pq.push(4);   pq.push(6);   pq.push(10); pq.push(9);  // 5 2 7 1 3 8 4 6 10 9

  printf("%d ", pq.top());            // 1
  pq.pop();  printf("%d ", pq.top()); // 2
  pq.pop();  printf("%d ", pq.top()); // 3
  pq.pop();  printf("%d ", pq.top()); // 4
  pq.pop();  printf("%d ", pq.top()); // 5
  pq.pop();  printf("%d ", pq.top()); // 6
  pq.pop();  printf("%d ", pq.top()); // 7
  pq.pop();  printf("%d ", pq.top()); // 8
  pq.pop();  printf("%d ", pq.top()); // 9
  pq.pop();  printf("%d ", pq.top()); // 10
  printf("\n");
}

Hash

해시(Hash)는 키-값 쌍을 저장하는데 사용되며, 해시 함수를 통해 키를 해시 테이블의 주소로 변환하여 데이터를 효율적으로 조회, 삽입, 삭제할 수 있다. 빠른 데이터 검색에 유리하다.

C++ Skeleton Code (Pure) (Chaining)

// Hash using chaining : https://blog.encrypted.gg/1009
#include <bits/stdc++.h>
using namespace std;

const int M = 1'000'003;
const int a = 1'000;
const int MX = 500'005; // 최대 삽입 횟수

int head[M];
int pre[MX], nxt[MX];
string key[MX];
int val[MX];
int unused = 0;

int myHash(string& s){
  int h=0;
  for(auto x : s)
    h = (h*a + x) % M;
  return h;
}

int find(string k) {
  int h = myHash(k);
  int idx = head[h];
  while(idx != -1) {
    if(key[idx] == k) return idx;
    idx = nxt[idx];
  }
  return -1;
}

void insert(string k, int v) {
  int idx = find(k);
  if(idx != -1) {
    val[idx] = v;
    return;
  }
  int h = myHash(k);
  key[unused] = k;
  val[unused] = v;
  if(head[h] != -1) {
    nxt[unused] = head[h];
    pre[head[h]] = unused;
  }
  head[h] = unused;
  unused++;
}

void erase(string k) {
  int idx = find(k);
  if(idx == -1) return;
  if(pre[idx] != -1) nxt[pre[idx]] = nxt[idx];
  if(nxt[idx] != -1) pre[nxt[idx]] = pre[idx];
  int h = myHash(k);
  if(head[h] == idx) head[h] = nxt[idx];
}

int main(int argc, char **argv){
  fill(head, head+M, -1);
  fill(pre, pre+MX, -1);
  fill(nxt, nxt+MX, -1);

  cout << "start!\n";
  insert("orange", 724);               // ("orange", 724)
  insert("melon", 20);                 // ("orange", 724), ("melon", 20)
  assert(val[find("melon")] == 20);
  insert("banana", 52);                // ("orange", 724), ("melon", 20), ("banana", 52)
  insert("cherry", 27);                // ("orange", 724), ("melon", 20), ("banana", 52), ("cherry", 27)
  insert("orange", 100);               // ("orange", 100), ("melon", 20), ("banana", 52), ("cherry", 27)
  assert(val[find("banana")] == 52);
  assert(val[find("orange")] == 100);
  erase("wrong_fruit");                // ("orange", 100), ("melon", 20), ("banana", 52), ("cherry", 27)
  erase("orange");                     // ("melon", 20), ("banana", 52), ("cherry", 27)
  assert(find("orange") == -1);
  erase("orange");                     // ("melon", 20), ("banana", 52), ("cherry", 27)
  insert("orange", 15);                // ("melon", 20), ("banana", 52), ("cherry", 27), ("orange", 15)
  assert(val[find("orange")] == 15);
  insert("apple", 36);                 // ("melon", 20), ("banana", 52), ("cherry", 27), ("orange", 15), ("apple", 36)
  insert("lemon", 6);                  // ("melon", 20), ("banana", 52), ("cherry", 27), ("orange", 15), ("apple", 36), ("lemon", 6)
  insert("orange", 701);               // ("melon", 20), ("banana", 52), ("cherry", 27), ("orange", 701), ("apple", 36), ("lemon", 6)
  assert(val[find("cherry")] == 27);
  erase("xxxxxxx");
  assert(find("xxxxxxx") == -1);
  assert(val[find("apple")] == 36);
  assert(val[find("melon")] == 20);
  assert(val[find("banana")] == 52);
  assert(val[find("cherry")] == 27);
  assert(val[find("orange")] == 701);
  assert(val[find("lemon")] == 6);
  cout << "good!\n";
}

C++ Skeleton Code (Pure) (Open addressing)

// Hash using open addressing : https://blog.encrypted.gg/1009
#include <bits/stdc++.h>
using namespace std;

const int M = 1'000'003;
const int a = 1'000;
const int EMPTY = -1;
const int OCCUPY = 0;
const int DUMMY = 1;

int status[M];   // EMPTY, OCCUPY, DUMMY
string key[M];
int val[M];

int myHash(string& s){
  int h=0;
  for(auto x : s)
    h = (h*a + x) % M;
  return h;
}

int find(string k) {
  int idx = myHash(k);
  while(status[idx] != EMPTY) {
    if(status[idx] == OCCUPY && key[idx] == k) return idx;
    idx = (idx+1) % M;
  }
  return -1;
}

void insert(string k, int v) {
  int idx = find(k);
  if(idx != -1) {
    val[idx] = v;
    return ;
  }
  idx = myHash(k);
  while(status[idx] == OCCUPY)
    idx = (idx+1) % M;

  key[idx] = k;
  val[idx] = v;
  status[idx] = OCCUPY;
}

void erase(string k) {
  int idx = find(k);
  if(idx != -1) status[idx] = DUMMY;
}

int main(int argc, char **argv){
  fill(status, status+M, -1);

  cout << "start!\n";
  insert("orange", 724);               // ("orange", 724)
  insert("melon", 20);                 // ("orange", 724), ("melon", 20)
  assert(val[find("melon")] == 20);
  insert("banana", 52);                // ("orange", 724), ("melon", 20), ("banana", 52)
  insert("cherry", 27);                // ("orange", 724), ("melon", 20), ("banana", 52), ("cherry", 27)
  insert("orange", 100);               // ("orange", 100), ("melon", 20), ("banana", 52), ("cherry", 27)
  assert(val[find("banana")] == 52);
  assert(val[find("orange")] == 100);
  erase("wrong_fruit");                // ("orange", 100), ("melon", 20), ("banana", 52), ("cherry", 27)
  erase("orange");                     // ("melon", 20), ("banana", 52), ("cherry", 27)
  assert(find("orange") == -1);
  erase("orange");                     // ("melon", 20), ("banana", 52), ("cherry", 27)
  insert("orange", 15);                // ("melon", 20), ("banana", 52), ("cherry", 27), ("orange", 15)
  assert(val[find("orange")] == 15);
  insert("apple", 36);                 // ("melon", 20), ("banana", 52), ("cherry", 27), ("orange", 15), ("apple", 36)
  insert("lemon", 6);                  // ("melon", 20), ("banana", 52), ("cherry", 27), ("orange", 15), ("apple", 36), ("lemon", 6)
  insert("orange", 701);               // ("melon", 20), ("banana", 52), ("cherry", 27), ("orange", 701), ("apple", 36), ("lemon", 6)
  assert(val[find("cherry")] == 27);
  erase("xxxxxxx");
  assert(find("xxxxxxx") == -1);
  assert(val[find("apple")] == 36);
  assert(val[find("melon")] == 20);
  assert(val[find("banana")] == 52);
  assert(val[find("cherry")] == 27);
  assert(val[find("orange")] == 701);
  assert(val[find("lemon")] == 6);
  cout << "good!\n";
}

C++ Skeleton Code (w/ STL)

// Hash using stl : https://blog.encrypted.gg/1009
#include <bits/stdc++.h>
using namespace std;

unordered_set<int> s;

int main(int argc, char **argv){
  cout << "start!\n";
  s.insert(-10); s.insert(100); s.insert(15);       // {-10, 100, 15}
  s.insert(-10);                                    // {-10, 100, 15}    
  cout << s.erase(100) << '\n';                     // 1
  cout << s.erase(20) << '\n';                      // 0
  if(s.find(15) != s.end()) cout << "15 in s\n";    // 15 in s
  else cout << "15 not in s\n";
  cout << s.size() << '\n';                         // 2
  cout << s.count(50) << '\n';                      // 0
  for(auto e : s) cout << e << ' ';                 // 15 -10
  cout << '\n';
  cout << "good!\n";
}

Tree

트리(Tree)는 계층적 관계를 표현하는 비선형 자료구조로, 노드와 노드를 연결하는 엣지로 구성된다. 이진 트리, AVL 트리, Red Black 트리 등 다양한 형태가 있으며, 계층적 데이터를 관리하는데 적합하다.

트리: 무방향이면서 사이클이 없는 연결 그래프(undirected acyclic connected graph)
- $V$개의 정점을 가지고 $V-1$개의 간선을 가지는 연결 그래프
- 연결 그래프이면서 임의의 간선을 제거하면 연결 그래프가 아니게 되는 그래프
이진 트리(Binary tree) : 각 노드의 자식이 최대 2개 이하인 트리 (레벨/전위/중위/후위 순회를 할 수 있다)
- 전위/중위/후위 순회를 하는 예제 (C++)

#include <bits/stdc++.h>
using namespace std;

struct Node{
  char left;
  char right;
};

int N;
map<char, Node> tree;

// 전위순회
void preOrder(char node){
  if(node=='.') return;
  printf("%c", node);
  preOrder(tree[node].left);
  preOrder(tree[node].right);
}

// 중위순회
void inOrder(char node){
  if(node=='.') return;
  inOrder(tree[node].left);
  printf("%c", node);
  inOrder(tree[node].right);
}

// 후위순회
void postOrder(char node){
  if(node=='.') return;
  postOrder(tree[node].left);
  postOrder(tree[node].right);
  printf("%c", node);
}

int main(int argc, char **argv){
  scanf("%d", &N); 

  for(int i=0;i<N;i++){
    char node, left, right;
    cin >> node >> left >> right;
    tree[node].left = left;
    tree[node].right = right;
  }

  preOrder('A'); printf("\n");
  inOrder('A'); printf("\n");
  postOrder('A'); printf("\n");
}

자가 균형 이진 검색 트리 : 트리가 편향되는 걸 방지하기 위해 자체적으로 루트를 변경하여 균형을 맞추는 트리 (AVL, Red Black 트리 등)

Binary search tree

이진 검색 트리(binary search tree)는 왼쪽 서브트리의 모든 값은 부모의 값보다 작고 오른쪽 서브트리의 모든 값은 부모의 값보다 큰 이진 트리(binary tree)를 말한다. 이진 검색 트리를 활용하면 대부분의 연산을 $O(\log n)$에 처리할 수 있다. 또한 원소가 크기 순으로 정렬되기 때문에 값의 대소와 관련된 성질이 필요한 경우에 유용하게 사용할 수 있다.

시간복잡도: insert, erase, find, update 모두 $O(\log n)$

std::set

STL에서 제공하는 set은 자가 균형 이진 검색 트리(self-balancing binary search tree)의 일종인 Red Black 트리 자료구조로 구현되어 있으며 '키' 만을 사용한다. 이는 요소의 중복을 허용하지 않는 모든 경우에 유용하게 사용된다.

시간복잡도: insert, erase, find, lower_bound, next, prev 모두 $O(\log n)$
C++ Skeleton Code (w/ STL)

// Set : https://blog.encrypted.gg/1013
#include <bits/stdc++.h>
using namespace std;

int main(){
  set<int> s;

  s.insert(-10); s.insert(100); s.insert(15);      // {-10, 15, 100}
  s.insert(-10);                                   // {-10, 15, 100}
  cout << s.erase(100) << '\n';                    // {-10, 15}, 1
  cout << s.erase(20) << '\n';                     // {-10, 15}, 0
  if(s.find(15) != s.end()) cout << "15 in s\n";
  else cout << "15 not in s\n";
  cout << s.size() << '\n';                        // 2
  cout << s.count(50) << '\n';                     // 0
  for(auto e : s) cout << e << ' ';
  cout << '\n';
  s.insert(-40);                                   // {-40, -10, 15}

  set<int>::iterator it1 = s.begin();              // {-40(<-it1), -10, 15}
  it1++;                                           // {-40, -10(<-it1), 15}

  auto it2 = prev(it1);                            // {-40(<-it2), -10, 15}
  it2 = next(it1);                                 // {-40, -10, 15(<-it2)}
  advance(it2, -2);                                // {-40(<-it2), -10, 15}

  auto it3 = s.lower_bound(-20);                   // {-40, -10(<-it3), 15}

  auto it4 = s.find(15);                           // {-40, -10, 15(<-it4)}

  cout << *it1 << '\n';                            // -10
  cout << *it2 << '\n';                            // -40
  cout << *it3 << '\n';                            // -10
  cout << *it4 << '\n';                            // 15
}

std::map

STL에서 제공하는 map은 자가 균형 이진 검색 트리(self-balancing binary search tree)의 일종인 Red Black 트리 자료구조로 구현되어 있으며 '키-값' 쌍을 사용한다. 각 키는 유일하며 이를 통해 값을 조회할 수 있다. 데이터의 효율적인 저장과 검색에 쓰인다.

시간복잡도: insert, erase, find, lower_bound, next, prev 모두 $O(\log n)$
C++ Skeleton Code (w/ STL)

// Map : https://blog.encrypted.gg/1013
#include <bits/stdc++.h>
using namespace std;

int main(){
  map<string, int> m;

  m["hi"] = 123;
  m["bkd"] = 1000;
  m["gogo"] = 165;                                    // ("bkd", 1000), ("gogo", 165), ("hi", 123)
  cout << m.size() << '\n';                           // 3
  m["hi"] = -7;                                       // ("bkd", 1000), ("gogo", 165), ("hi", -7)

  if(m.find("hi") != m.end()) cout << "hi in m\n";
  else cout << "hi not in m\n";

  m.erase("bkd");                                     // ("gogo", 165), ("hi", 123)

  for(auto e : m)
    cout << e.first << ' ' << e.second << '\n';

  auto it1 = m.find("gogo");
  cout << it1->first << ' ' << it1->second << '\n';   // gogo 165
}

Segment tree

구간 트리(segment tree)는 배열 같은 선형 자료구조에 대해 구간 합이나 구간 최댓값·최솟값 등의 쿼리를 빠르게 처리하기 위해 트리 형태로 분할·구성하는 자료구조이다. 배열을 여러 구간으로 나누어 각 구간의 대표 정보를 저장하며, 쿼리가 들어오면 관련된 구간들만 빠르게 확인하여 결과를 합산하거나 갱신한다. 이를 통해 원소를 일일이 확인하지 않고도 구간 정보를 효율적으로 처리할 수 있으며, 쿼리와 업데이트 모두 일반적으로 $O(\log n)$에 처리할 수 있다.

C++ Skeleton Code (Pure)

// JUNGOL 1726 구간의 최대값1
// jungol.co.kr/problem/1726
// 세그먼트 트리 : 업데이트 및 쿼리
#include <cstdio>

const int LM = 50005;
const int TLM = 1 << 17; // 131072

int N, Q;
int tree[TLM];

int max(int a, int b) { return a > b ? a : b; }

// 세그먼트 트리 업데이트 함수
void update(int node, int s, int e, int tg, int val) {
	if (s >= e) { // 리프 노드에 도달했을 때 값 업데이트
		tree[node] = val;
		return;
	}
	int m = (s + e) / 2, lch = node * 2, rch = lch + 1;
	if (tg <= m) update(lch, s, m, tg, val);  // 왼쪽 자식 노드 업데이트
	else update(rch, m + 1, e, tg, val);      // 오른쪽 자식 노드 업데이트
	tree[node] = max(tree[lch], tree[rch]);   // 부모 노드는 자식 노드 중 최댓값을 저장
}

// 구간 내 최댓값을 찾는 쿼리 함수
int query(int node, int s, int e, int qs, int qe) {
	if (e < qs || qe < s) return -1;           // 쿼리 범위 밖이면 무효값 반환
	if (qs <= s && e <= qe) return tree[node]; // 현재 구간이 완전히 쿼리 범위에 포함되면 반환
	int m = (s + e) / 2, lch = node * 2, rch = lch + 1;
	int leftMax = query(lch, s, m, qs, qe);    // 왼쪽 서브트리 탐색
	int rightMax = query(rch, m + 1, e, qs, qe); // 오른쪽 서브트리 탐색
	return max(leftMax, rightMax);             // 두 값 중 최댓값 반환
}

int main() {
	scanf("%d %d", &N, &Q);
	int i, s, e, val;

	// 초기 데이터 입력 및 트리 업데이트
	for (i = 1; i <= N; ++i) {
		scanf("%d", &val);
		update(1, 1, N, i, val);
	}

	// 쿼리 처리
	for (i = 0; i < Q; ++i) {
		scanf("%d %d", &s, &e);
		printf("%d\n", query(1, 1, N, s, e));
	}
}

C++ Skeleton Code (Pure, 구조체 버전)

// JUNGOL 1726 구간의 최대값1
// jungol.co.kr/problem/1726
#include <bits/stdc++.h>
#define FASTIO cin.tie(0); cout.tie(0); ios_base::sync_with_stdio(0);
using namespace std;

int N, Q;

struct SegTree{
  int n;                 // 배열의 길이
  vector<int> tree;      // 각 구간의 최소치
  
  SegTree(const vector<int>& arr) {
    n = arr.size();
    tree.resize(n*4);
    init(arr, 0, n-1, 1);
  }
  // node가 arr[l ... r] 배열을 표현할 때 node를 루트로 하는 서브트리를 초기화하고 이 구간의 최소치를 반환한다
  int init(const vector<int>& arr, int s, int e, int node){
    if(s==e) return tree[node] = arr[s];
    
    int m = (s+e)/2;
    int smin = init(arr, s, m, node*2);
    int emin = init(arr, m+1, e, node*2+1);
    return tree[node] = max(smin, emin);
  }
  
  int query(int s, int e, int node, int qs, int qe) {
    if(e < qs || qe < s) return INT_MIN;
    if(qs >= s && qe <= e) return tree[node];
    
    int m = (qs+qe)/2;
    return max( query(s, e, node*2, qs, m), 
                query(s, e, node*2+1, m+1, qe) );
  }
  
  int query(int s, int e) { return query(s, e, 1, 0, n-1); }
  
  int update(int idx, int nval, int node, int qs, int qe) {
    // idx가 노드가 표현하는 구간과 상관없는 경우엔 무시한다
    if(idx < qs || qe < idx) return tree[node];
    if(qs == qe) return tree[node] = nval;
    
    int m = (qs+qe)/2;
    return tree[node] = max(update(idx, nval, node*2, qs, m),
                            update(idx, nval, node*2+1, m+1, qe) );
  }
  
  int update(int idx, int nval) { return update(idx, nval, 1, 0, n-1); }
};   

int main(int argc, char **argv){
  FASTIO;
  cin >> N >> Q;

  vector<int> arr(N);
  for(int i = 0; i < N; i++) {
      cin >> arr[i];
  }

  SegTree seg(arr);

  while (Q--) {
      int s, e; cin >> s >> e;
      int ans = seg.query(s - 1, e - 1);
      cout << ans << '\n';
  }
}

Fenwick tree

펜윅 트리(Fenwick tree), 또는 바이너리 인덱스 트리(Binary Indexed Tree)는 선형적인 배열 구조에서 특정 위치의 값 갱신과 구간의 누적 합(prefix sum) 등의 쿼리를 효율적으로 처리하기 위한 자료구조이다. 배열을 인덱스의 비트 단위로 적절히 나누어, 각 구간에 대한 부분적인 합 정보를 트리 형태로 압축하여 저장한다. 쿼리가 발생하면 이 비트 연산을 이용해 필요한 구간들만 효율적으로 탐색하고 결과를 빠르게 누적하거나 갱신한다. 이를 통해 단순 배열 탐색 없이도 빠르게 구간 정보를 처리할 수 있으며, 각 원소 업데이트와 누적합 쿼리를 모두 일반적으로 $O(\log N)$ 시간에 처리할 수 있다.

세그먼트 트리(Segment tree)보다 구현이 훨씬 간단하고 메모리 사용량도 적으며 상수 시간이 작아 더 빠르게 동작하는 경우가 많다.
하지만 세그먼트 트리가 다양한 연산(구간 합, 최대/최소, GCD 등)에 쉽게 확장 가능한 반면, 펜윅 트리는 주로 구간 합과 빈도수 관리 등 비교적 단순한 연산에 특화되어 있다. 펜윅 트리는 기본적으로 누적합 구조(prefix sum)를 기반으로 하므로 임의의 복잡한 연산을 처리하려면 추가적인 변형이나 테크닉이 필요하다.
C++ Skeleton Code (Pure, 구조체 버전)

#include <bits/stdc++.h>
using namespace std;

struct Fenwick {
  vector<int> tree;
  Fenwick(int n) : tree(n+1) {}

  int sum(int pos){
    ++pos;
    int ret=0;
    while(pos>0){
      ret += tree[pos];
      pos &= (pos-1);
    }
    return ret;
  }

  void add(int pos, int val){
    ++pos;
    while(pos < tree.size()){
      tree[pos] += val;
      pos += (pos & -pos);
    }
  }

  int range_sum(int s, int e) {
    return sum(e) - sum(s-1);
  }
};

int main() {
  int N = 8;                                   // 원소 8개
  vector<int> A = {0, 3, 2, 4, 1, 6, 5, 7, 2}; // 1-based dummy 0 포함
          // idx:     1  2  3  4  5  6  7  8

  Fenwick ft(N);

  // build
  for(int i=1; i<=N; i++) ft.add(i, A[i]);

  printf("%d\n", ft.range_sum(2,5));    // 2~5 합 = 2+4+1+6 = 13
  ft.add(4,3);                          // A[4] += 3 (4→7)
  printf("%d\n", ft.range_sum(2,5));    // 2~5 합 = 16 
}

Graph

그래프(Graph)는 노드들과 이를 연결하는 엣지들로 구성된 자료구조이다. 방향성, 가중치 유무에 따라 다양한 종류가 있으며, 네트워크 구조, 경로 찾기 등에 사용된다.

차수(degree) : 각 노드에 대해 엣지로 연결된 이웃한 노드의 개수
무방향 그래프(undirected graph) : 엣지의 방향이 없는 그래프 (degree 존재)
방향 그래프(directed graph): 엣지의 방향이 있는 그래프 (indegree, outdegree 존재)
사이클 : 임의의 한 점에서 출발해 자기 자신으로 돌아올 수 있는 경로
순환 그래프(cyclic graph) : 사이클이 하나라도 있는 그래프
비순환 그래프(acyclic graph) : 사이클이 하나도 없는 그래프
완전 그래프(complete graph) : 모든 서로 다른 두 노드 쌍이 엣지로 연결된 그래프
연결 그래프(connected graph) : 임의의 두 노드 사이에 경로가 항상 존재하는 그래프
루프 : 한 노드에서 시작해 같은 노드로 들어오는 엣지
C++ Skeleton Code (Pure) (인접 행렬)

#include <bits/stdc++.h>
using namespace std;

int adj[10][10] = {};

int main(int argc, char **argv){
  int v,e;
  scanf("%d %d", &v, &e);
  for(int i=0; i<e; i++){
    int u,v;
    scanf("%d %d", &u, &v);
    adj[u][v] = 1;
    adj[v][u] = 1;    // 무방향 그래프 (방향 그래프이면 12번 라인만 사용)
  }
}

C++ Skeleton Code (w/ STL) (인접 리스트)

#include <bits/stdc++.h>
using namespace std;

int edge[10][2];
int deg[10];     // 각 정점의 outdegree
int* adj[10];
int idx[10];     // adj[i]에서 새로운 정점을 추가할 때의 위치

int main(int argc, char **argv){
  int v,e;
  scanf("%d %d", &v, &e);
  for(int i=0; i<e; i++){
    scanf("%d %d", &edge[i][0], &edge[i][1]);
    deg[edge[i][0]]++;
    deg[edge[i][1]]++;   // 무방향 그래프 (방향 그래프인 경우 14번 라인만 사용)
  }

  for(int i=1; i<=v; i++)
    adj[i] = new int[deg[i]];

  for(int i=0; i<e; i++){
    int u = edge[i][0], v = edge[i][1];
    adj[u][idx[u]] = v;
    idx[u]++;
  }
}

C++ Skeleton Code (w/ STL) (인접 리스트)

#include <bits/stdc++.h>
using namespace std;

vector<int> adj[10];

int main(int argc, char **argv){
  int v,e;
  scanf("%d %d", &v, &e);
  for(int i=0; i<e; i++){
    int u,v;
    scanf("%d %d", &u, &v);
    adj[u].push_back(v);
    adj[v].push_back(u);    // 무방향 그래프 (방향 그래프이면 12번 라인만 사용)
  }
}

공간복잡도: 인접 행렬 $O(V^2)$, 인접 리스트 $O(V+E)$
인접 행렬 구현은 두 점의 연결여부를 자주 확인하거나 $E$가 $V^2$와 가까울 때 사용하면 효율적
인접 리스트 구현은 특정 정점에 연결된 모든 정점을 자주 확인하거나 $E$가 $V^2$보다 훨씬 작을 때 효율적이다 (일반적으로 인접 리스트가 필요한 상황이 PS에서 자주 나옴)

Union-find

유니온 파인드(union find) 또는 서로소 집합(disjoint set union, DSU)는 여러 개의 원소를 효율적으로 그룹화하고 관리하는 자료구조로, "find" 연산을 통해 특정 원소가 속한 집합의 대표를 찾고, "Union" 연산을 통해 두 집합을 합칠 수 있다. 경로 압축(path compression)과 랭크 기반 합치기(union by rank) 같은 최적화 기법을 적용하면 평균적으로 매우 빠르게 동작하며, 그래프의 연결 요소 찾기, 최소 스패닝 트리(MST) 등 다양한 문제에서 활용된다.

시간복잡도: 최적화 적용 시 $O(\alpha(n))$
- $\alpha(n)$ : 아커만 함수, $\alpha(10^{80})=5$ 수준이므로 사실 상 상수 시간복잡도라고 보면 된다
C++ Skeleton Code (Pure) #1

// 종교 jungol.co.kr/problem/1863
#include <bits/stdc++.h>
using namespace std;

constexpr int LM = 50005;
int N, M, ret;
int p[LM], rk[LM];

int find(int x) {
  if (p[x] == x) return x;
  return p[x] = find(p[x]);   // 최적화1 : 경로 압축 적용
}

void swap(int& a, int& b) { int c=a; a=b; b=c;}

void Union( int a, int b ) {
  a = find( a ), b = find( b );
  if ( a == b ) return;
  ret--;

  // 최적화2 : 랭크 기반 합치기
  if ( rk[a] < rk[b] ) swap( a, b );
  p[b] = a;
  if ( rk[a] == rk[b] ) rk[a]++;
}

int main() {
  scanf( "%d %d", &N, &M );
  ret = N;

	for(int i=0; i<N; i++){
		p[i] = i;
		rk[i] = 0;  // 랭크 초기화
	}

  for ( int i = 0; i < M; i++ ) {
    int a, b;
    scanf( "%d %d", &a, &b );
    Union( a, b );
  }
  printf( "%d", ret );
}

C++ Skeleton Code (Pure) #2

// 종교 jungol.co.kr/problem/1863
#include <bits/stdc++.h>
using namespace std;

constexpr int LM = 50005;
int N, M, ret;
vector<int> p(LM, -1);

int find(int x) {
  if (p[x] < 0) return x;
  return p[x] = find(p[x]);   // 최적화1 : 경로 압축 적용
}

void swap(int& a, int& b) { int c=a; a=b; b=c;}

void Union( int a, int b ) {
  a = find( a ), b = find( b );
  if ( a == b ) return;
  ret--;

  // 최적화2 : 랭크 기반 합치기, rank, parent를 하나의 변수에서 처리하는 버전
  if ( p[a] < p[b] ) swap( a, b );
  if ( p[a] == p[b] ) p[a]--;
  p[b] = a;
}

int main() {
  scanf( "%d %d", &N, &M );
  ret = N;

  for ( int i = 0; i < M; i++ ) {
    int a, b;
    scanf( "%d %d", &a, &b );
    Union( a, b );
  }
  printf( "%d", ret );
}

Trie

Trie(트라이)는 문자열을 효율적으로 저장하고 검색하는 트리형 자료구조로, 각 노드는 문자열의 접두사를 공유하며 계층적으로 구성된다. "삽입(insert)" 연산을 통해 문자를 추가하고, "탐색(search)" 연산을 통해 특정 문자열이 존재하는지 빠르게 확인할 수 있다. 노드에 값을 저장하거나 마킹하는 방식으로 단어의 끝을 나타내며, 해시맵 기반 구현 또는 배열을 활용한 최적화 기법을 적용하면 $O(|L|)$의 시간 복잡도로 동작한다. Trie는 자동완성, 사전(dictionary), 문자열 검색 등 다양한 문제에서 활용된다.

$|L|$ : 문자열 $L$의 길이
시간복잡도: insert, erase, find 모두 $O(|L|)$
문자열을 그냥 배열에 저장하는 것보다 최대 '4 x 글자의 종류' 배만큼 메모리를 더 사용한다 (e.g., 각 단어가 알파벳 대문자로만 구성되어 있을 경우 104배)
이론적인 시간복잡도와는 별개로 실제로는 트라이가 해시, 이진검색트리에 비해 훨씬 느리며 공간복잡도도 훨씬 크다. 일반적인 문자열 문제는 해시, 이진검색트리를 사용하면 되지만 자동완성 같은 특수한 활용에서는 트라이가 필요하다
C++ Skeleton Code (Pure)

// 문자열 집합 boj.kr/14425
#include <bits/stdc++.h>
using namespace std;

const int LM = 500 * 10000 + 5;  // 최대 등장 가능한 글자의 수 (길이 500인 문자열 10000개)
const int ROOT = 1;

int N, M;
int unused = 2;
bool chk[LM];     // 해당 정점이 문자열의 끝인지 여부 저장
int nxt[LM][26];  // 각 정점에서 자식의 정점 번호
                  // 배열에 문자열 저장 : char 1칸 (1바이트)
                  // Trie에 문자열 저장 : int 26칸 (4x26바이트)

int c2i(char c) { 	return c-'a'; }

// Trie
void insert( string& s ) {
  int cur = ROOT;
  for ( auto c : s ) {
    int i = c2i( c );
    if ( nxt[cur][i] == -1 )
      nxt[cur][i] = unused++;
    cur = nxt[cur][i];
  }
  chk[cur] = true;
}

bool find( string& s ) {
  int cur = ROOT;
  for ( auto c : s ) {
    int i = c2i( c );
    if ( nxt[cur][i] == -1 ) return false;
    cur = nxt[cur][i];
  }
  return chk[cur];
}

void erase( string& s ) {
  int cur = ROOT;
  for ( auto c : s ) {
    int i = c2i( c );
    if ( nxt[cur][i] == -1 ) return;
    cur = nxt[cur][i];
  }
  chk[cur] = false;
}

int main( int argc, char** argv ) {
  for ( int i = 0; i < LM; i++ ) {
    fill( nxt[i], nxt[i] + 26, -1 );
  }

  scanf( "%d%d", &N, &M );

  while ( N-- ) {
    string s;
    cin >> s;
    insert( s );
  }

  int ans = 0;
  while ( M-- ) {
    string s;
    cin >> s;
    ans += find( s );
  }
  printf( "%d\n", ans );
}

C++ Skeleton Code (Pure, 구조체 버전)

// 문자열 집합 boj.kr/14425
#include <bits/stdc++.h>
using namespace std;

const int LM = 500*10000+5;
int N,M;

int c2i(char c) { return c-'a'; }

struct TrieNode {
  TrieNode* child[26];
  bool terminal;

  TrieNode() :terminal(false){
    memset(child, 0, sizeof(child));
  }
  ~TrieNode(){
    for(int i=0; i<26; i++){
      if(child[i]) delete child[i];
    }
  }

  void insert(const char* key){
    if(*key==0) terminal = true;
    else {
      int next = c2i(*key);
      if(child[next] == NULL) child[next] = new TrieNode();
      child[next]->insert(key+1);
    }
  }

  TrieNode* find(const char* key){
    if(*key==0) return this;
    int next = c2i(*key);
    if(child[next] == NULL) return NULL;
    return child[next]->find(key+1);
  }
};


int main(int argc, char **argv){
  //freopen("s_in_1345.txt", "r", stdin);

  scanf("%d%d",&N,&M);

  TrieNode trie;

  while(N--){
    string s; cin >> s;
    trie.insert(s.c_str());
  }

  int ans = 0;

  while(M--) {
    string s; cin >> s;
    TrieNode* t = trie.find(s.c_str());
    if(t != nullptr && t->terminal) ans++;
  }

  printf("%d\n", ans);
}

References

[1] (Blog) 알고리즘 문제풀이(PS) 시작하기 - plzrun

[2] (Blog) 내가 문제풀이를 연습하는 방법 - koosaga

[3] (Blog) 코드포스 블루, 퍼플 달성 후기 및 일지 & 공부법 - Rebro

[4] (Blog) PS를 공부하는 방법 (How to study Problem Solving?) - subinium

[5] (Site) Baejoon Online Judge

[6] (Site) JungOl

[7] (Site) SW Expert Academy

[8] (Blog) BaaaaaaaarkingDog의 강좌/실전 알고리즘

[9] (Youtube) BaaaaaaaarkingDog의 강좌/실전 알고리즘

[10] (Site) C++ 유용한 기능들

[11] (Book) 알고리즘 문제 해결 전략 - 구종만 저

저작자표시 (새창열림)

'Coding' 카테고리의 다른 글

[PS] 알고리즘 문제 풀이 - 유용한 팁 및 문제 리스트 정리 (0)	2024.03.08
[PS] 자료구조와 알고리즘 개념 정리 (Algorithm) Part 2 (1)	2024.03.08
VPython 예제10 - 2자유도 반한정계 스프링-질량 애니메이션 (0)	2022.01.11
VPython 예제9 - 2자유도 스프링-질량계 애니메이션 (2) (0)	2022.01.11
VPython 예제8 - 2자유도 스프링-질량계 애니메이션 (0)	2022.01.11

A L I D A

[PS] 자료구조와 알고리즘 개념 정리 (Data Structure) Part 1

내용을 보강하여 [PS] 알고리즘, 자료구조, 풀이팁, 문제리스트 포스트에 업로드하였습니다.

Data structure

Array

std::vector

Linked list

std::list

Stack

Queue

Deque

Heap

Priority queue

Hash

Tree

Binary search tree

std::set

std::map

Segment tree

Fenwick tree

Graph

Union-find

Trie

References

'Coding' 카테고리의 다른 글

티스토리툴바

[PS] 자료구조와 알고리즘 개념 정리 (Data Structure) Part 1

내용을 보강하여 [PS] 알고리즘, 자료구조, 풀이팁, 문제리스트 포스트에 업로드하였습니다.

Data structure

Array

std::vector

Linked list

std::list

Stack

Queue

Deque

Heap

Priority queue

Hash

Tree

Binary search tree

std::set

std::map

Segment tree

Fenwick tree

Graph

Union-find

Trie

References

'Coding' 카테고리의 다른 글

'Coding' Related Articles

티스토리툴바