Algorithms Notes for Professionals

200+ pages of professional hints and tricks

GoalKicker.com
Free Programming Books

Disclaimer
This is an unofficial free book created for educational purposes and is not affiliated with official Algorithms group(s) or company(s). All trademarks and registered trademarks are the property of their respective owners.

Contents

About
Chapter 1: Getting started with algorithms
    Section 1.1: A sample algorithmic problem
    Section 1.2: Getting Started with Simple Fizz Buzz Algorithm in Swift
Chapter 2: Algorithm Complexity
    Section 2.1: Big-Theta notation
    Section 2.2: Comparison of the asymptotic notations
    Section 2.3: Big-Omega Notation
Chapter 3: Big-O Notation
    Section 3.1: A Simple Loop
    Section 3.2: A Nested Loop
    Section 3.3: O(log n) types of Algorithms
    Section 3.4: An O(log n) example
Chapter 4: Trees
    Section 4.1: Typical n-ary tree representation
    Section 4.2: Introduction
    Section 4.3: To check if two Binary trees are same or not
Chapter 5: Binary Search Trees
    Section 5.1: Binary Search Tree - Insertion (Python)
    Section 5.2: Binary Search Tree - Deletion (C++)
    Section 5.3: Lowest common ancestor in a BST
    Section 5.4: Binary Search Tree - Python
Chapter 6: Check if a tree is BST or not
    Section 6.1: Algorithm to check if a given binary tree is BST
    Section 6.2: If a given input tree follows Binary search tree property or not
Chapter 7: Binary Tree traversals
    Section 7.1: Level Order traversal - Implementation
    Section 7.2: Pre-order, Inorder and Post Order traversal of a Binary Tree
Chapter 8: Lowest common ancestor of a Binary Tree
    Section 8.1: Finding lowest common ancestor
Chapter 9: Graph
    Section 9.1: Storing Graphs (Adjacency Matrix)
    Section 9.2: Introduction To Graph Theory
    Section 9.3: Storing Graphs (Adjacency List)
    Section 9.4: Topological Sort
    Section 9.5: Detecting a cycle in a directed graph using Depth First Traversal
    Section 9.6: Thorup's algorithm
Chapter 10: Graph Traversals
    Section 10.1: Depth First Search traversal function
Chapter 11: Dijkstra’s Algorithm
    Section 11.1: Dijkstra's Shortest Path Algorithm
Chapter 12: A* Pathfinding
    Section 12.1: Introduction to A*
    Section 12.2: A* Pathfinding through a maze with no obstacles
    Section 12.3: Solving 8-puzzle problem using A* algorithm
Chapter 13: A* Pathfinding Algorithm
    Section 13.1: Simple Example of A* Pathfinding: A maze with no obstacles
Chapter 14: Dynamic Programming
    Section 14.1: Edit Distance
    Section 14.2: Weighted Job Scheduling Algorithm
    Section 14.3: Longest Common Subsequence
    Section 14.4: Fibonacci Number
    Section 14.5: Longest Common Substring
Chapter 15: Applications of Dynamic Programming
    Section 15.1: Fibonacci Numbers
Chapter 16: Kruskal's Algorithm
    Section 16.1: Optimal, disjoint-set based implementation
    Section 16.2: Simple, more detailed implementation
    Section 16.3: Simple, disjoint-set based implementation
    Section 16.4: Simple, high level implementation
Chapter 17: Greedy Algorithms
    Section 17.1: Huffman Coding
    Section 17.2: Activity Selection Problem
    Section 17.3: Change-making problem
Chapter 18: Applications of Greedy technique
    Section 18.1: Offline Caching
    Section 18.2: Ticket automat
    Section 18.3: Interval Scheduling
    Section 18.4: Minimizing Lateness
Chapter 19: Prim's Algorithm
    Section 19.1: Introduction To Prim's Algorithm
Chapter 20: Bellman–Ford Algorithm
    Section 20.1: Single Source Shortest Path Algorithm (Given there is a negative cycle in a graph)
    Section 20.2: Detecting Negative Cycle in a Graph
    Section 20.3: Why do we need to relax all the edges at most (V-1) times
Chapter 21: Line Algorithm
    Section 21.1: Bresenham Line Drawing Algorithm
Chapter 22: Floyd-Warshall Algorithm
    Section 22.1: All Pair Shortest Path Algorithm
Chapter 23: Catalan Number Algorithm
    Section 23.1: Catalan Number Algorithm Basic Information
Chapter 24: Multithreaded Algorithms
    Section 24.1: Square matrix multiplication multithread
    Section 24.2: Multiplication matrix vector multithread
    Section 24.3: merge-sort multithread
Chapter 25: Knuth Morris Pratt (KMP) Algorithm
    Section 25.1: KMP-Example
Chapter 26: Edit Distance Dynamic Algorithm
    Section 26.1: Minimum Edits required to convert string 1 to string 2
Chapter 27: Online algorithms
    Section 27.1: Paging (Online Caching)
Chapter 28: Sorting
    Section 28.1: Stability in Sorting
Chapter 29: Bubble Sort
    Section 29.1: Bubble Sort
    Section 29.2: Implementation in C & C++
    Section 29.3: Implementation in C#
    Section 29.4: Python Implementation
    Section 29.5: Implementation in Java
    Section 29.6: Implementation in Javascript
Chapter 30: Merge Sort
    Section 30.1: Merge Sort Basics
    Section 30.2: Merge Sort Implementation in Go
    Section 30.3: Merge Sort Implementation in C & C#
    Section 30.4: Merge Sort Implementation in Java
    Section 30.5: Merge Sort Implementation in Python
    Section 30.6: Bottoms-up Java Implementation
Chapter 31: Insertion Sort
    Section 31.1: Haskell Implementation
Chapter 32: Bucket Sort
    Section 32.1: C# Implementation
Chapter 33: Quicksort
    Section 33.1: Quicksort Basics
    Section 33.2: Quicksort in Python
    Section 33.3: Lomuto partition java implementation
Chapter 34: Counting Sort
    Section 34.1: Counting Sort Basic Information
    Section 34.2: Pseudocode Implementation
Chapter 35: Heap Sort
    Section 35.1: C# Implementation
    Section 35.2: Heap Sort Basic Information
Chapter 36: Cycle Sort
    Section 36.1: Pseudocode Implementation
Chapter 37: Odd-Even Sort
    Section 37.1: Odd-Even Sort Basic Information
Chapter 38: Selection Sort
    Section 38.1: Elixir Implementation
    Section 38.2: Selection Sort Basic Information
    Section 38.3: Implementation of Selection sort in C#
Chapter 39: Searching
    Section 39.1: Binary Search
    Section 39.2: Rabin Karp
    Section 39.3: Analysis of Linear search (Worst, Average and Best Cases)
    Section 39.4: Binary Search: On Sorted Numbers
    Section 39.5: Linear search
Chapter 40: Substring Search
    Section 40.1: Introduction To Knuth-Morris-Pratt (KMP) Algorithm
    Section 40.2: Introduction to Rabin-Karp Algorithm
    Section 40.3: Python Implementation of KMP algorithm
    Section 40.4: KMP Algorithm in C
Chapter 41: Breadth-First Search
    Section 41.1: Finding the Shortest Path from Source to other Nodes
    Section 41.2: Finding Shortest Path from Source in a 2D graph
    Section 41.3: Connected Components Of Undirected Graph Using BFS
Chapter 42: Depth First Search
    Section 42.1: Introduction To Depth-First Search
Chapter 43: Hash Functions
    Section 43.1: Hash codes for common types in C#
    Section 43.2: Introduction to hash functions
Chapter 44: Travelling Salesman
    Section 44.1: Brute Force Algorithm
    Section 44.2: Dynamic Programming Algorithm
Chapter 45: Knapsack Problem
    Section 45.1: Knapsack Problem Basics
    Section 45.2: Solution Implemented in C#
Chapter 46: Equation Solving
    Section 46.1: Linear Equation
    Section 46.2: Non-Linear Equation
Chapter 47: Longest Common Subsequence
    Section 47.1: Longest Common Subsequence Explanation
Chapter 48: Longest Increasing Subsequence
    Section 48.1: Longest Increasing Subsequence Basic Information
Chapter 49: Check two strings are anagrams
    Section 49.1: Sample input and output
    Section 49.2: Generic Code for Anagrams
Chapter 50: Pascal's Triangle
    Section 50.1: Pascal triangle in C
Chapter 51: Algo:- Print a m*n matrix in square wise
    Section 51.1: Sample Example
    Section 51.2: Write the generic code
Chapter 52: Matrix Exponentiation
    Section 52.1: Matrix Exponentiation to Solve Example Problems
Chapter 53: polynomial-time bounded algorithm for Minimum Vertex Cover
    Section 53.1: Algorithm Pseudo Code
Chapter 54: Dynamic Time Warping
    Section 54.1: Introduction To Dynamic Time Warping
Chapter 55: Fast Fourier Transform
    Section 55.1: Radix 2 FFT
    Section 55.2: Radix 2 Inverse FFT
Appendix A: Pseudocode
    Section A.1: Variable affectations
    Section A.2: Functions
Credits
You may also like

About

Please feel free to share this PDF with anyone for free. The latest version of this book can be downloaded from:
https://goalkicker.com/AlgorithmsBook

This Algorithms Notes for Professionals book is compiled from Stack Overflow Documentation; the content is written by the beautiful people at Stack Overflow. Text content is released under Creative Commons BY-SA; see the credits at the end of this book for those who contributed to the various chapters. Images may be copyright of their respective owners unless otherwise specified.

This is an unofficial free book created for educational purposes and is not affiliated with official Algorithms group(s) or company(s) nor Stack Overflow. All trademarks and registered trademarks are the property of their respective company owners.

The information presented in this book is not guaranteed to be correct nor accurate; use at your own risk.

Please send feedback and corrections to [email protected]

GoalKicker.com – Algorithms Notes for Professionals

Chapter 1: Getting started with algorithms

Section 1.1: A sample algorithmic problem

An algorithmic problem is specified by describing the complete set of instances it must work on and the output it must produce after running on one of these instances. This distinction, between a problem and an instance of a problem, is fundamental. The algorithmic problem known as sorting is defined as follows: [Skiena:2008:ADM:1410219]

Problem: Sorting
Input: A sequence of n keys, a_1, a_2, ..., a_n.
Output: The reordering of the input sequence such that a'_1 <= a'_2 <= ... <= a'_{n-1} <= a'_n

An instance of sorting might be an array of strings, such as { Haskell, Emacs }, or a sequence of numbers such as { 154, 245, 1337 }.

Section 1.2: Getting Started with Simple Fizz Buzz Algorithm in Swift

For those of you who are new to programming in Swift, and those of you coming from other programming backgrounds such as Python or Java, this article should be quite helpful. In this post, we will discuss a simple solution for implementing Swift algorithms.

Fizz Buzz

You may have seen Fizz Buzz written as Fizz Buzz, FizzBuzz, or Fizz-Buzz; they all refer to the same thing. That "thing" is the main topic of discussion today. First, what is FizzBuzz?

This is a common question that comes up in job interviews.

Imagine the series of numbers from 1 to 10.

    1 2 3 4 5 6 7 8 9 10

Fizz and Buzz refer to any number that is a multiple of 3 and 5 respectively. In other words, if a number is divisible by 3, it is substituted with fizz; if a number is divisible by 5, it is substituted with buzz. If a number is simultaneously a multiple of 3 AND 5, the number is replaced with "fizz buzz." In essence, it emulates the famous children's game "fizz buzz".

To work on this problem, open up Xcode, create a new playground, and initialize an array like below:

    // for example
    let number = [1,2,3,4,5]
    // here 3 is fizz and 5 is buzz

To find all the fizz and buzz, we must iterate through the array and check which numbers are fizz and which are buzz. To do this, create a for loop to iterate through the array we have initialised:

    for num in number {
        // Body and calculation goes here
    }

After this, we can simply use an "if else" condition and Swift's modulo operator, %, to locate the fizz and buzz:

    for num in number {
        if num % 3 == 0 {
            print("\(num) fizz")
        } else {
            print(num)
        }
    }

Great! You can go to the debug console in the Xcode playground to see the output.
You will find that the "fizzes" have been sorted out in your array.

For the Buzz part, we will use the same technique. Give it a try before scrolling through the article; you can check your results against this article once you've finished.

    for num in number {
        if num % 3 == 0 {
            print("\(num) fizz")
        } else if num % 5 == 0 {
            print("\(num) buzz")
        } else {
            print(num)
        }
    }

Check the output!

It's rather straightforward: a number divisible by 3 is fizz, and a number divisible by 5 is buzz. Now, increase the numbers in the array:

    let number = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]

We increased the range of numbers from 1-5 to 1-15 in order to demonstrate the concept of a "fizz buzz." Since 15 is a multiple of both 3 and 5, the number should be replaced with "fizz buzz." Try for yourself and check the answer!

Here is the solution:

    for num in number {
        if num % 3 == 0 && num % 5 == 0 {
            print("\(num) fizz buzz")
        } else if num % 3 == 0 {
            print("\(num) fizz")
        } else if num % 5 == 0 {
            print("\(num) buzz")
        } else {
            print(num)
        }
    }

Wait...it's not over though! The remaining goal is to avoid unnecessary work at runtime. Imagine the range increases from 1-15 to 1-100. For each number, the first condition tests divisibility by 3 and, where needed, by 5, and the later branches then repeat those same tests, so many numbers are checked against 3 or 5 more than once. A number is a multiple of both 3 and 5 exactly when it is a multiple of 15, so the combined condition can be replaced by a single test. To speed up the process, we can simply tell our code to divide the numbers by 15 directly.
Here is the final code:

    for num in number {
        if num % 15 == 0 {
            print("\(num) fizz buzz")
        } else if num % 3 == 0 {
            print("\(num) fizz")
        } else if num % 5 == 0 {
            print("\(num) buzz")
        } else {
            print(num)
        }
    }

As simple as that. You can use any language of your choice to get started.

Enjoy coding!

Chapter 2: Algorithm Complexity

Section 2.1: Big-Theta notation

Unlike Big-O notation, which represents only an upper bound of the running time of an algorithm, Big-Theta is a tight bound: both an upper and a lower bound. A tight bound is more precise, but also more difficult to compute.

The Big-Theta notation is symmetric: f(x) = Θ(g(x)) <=> g(x) = Θ(f(x))

An intuitive way to grasp it is that f(x) = Θ(g(x)) means that the graphs of f(x) and g(x) grow at the same rate, or that the graphs 'behave' similarly for big enough values of x.

The full mathematical expression of the Big-Theta notation is as follows:

    Θ(f(x)) = {g: N0 -> R and c1, c2, n0 > 0, where c1 < abs(g(n) / f(n)) < c2, for every n > n0 and abs is the absolute value }

An example

If the algorithm for the input n takes 42n^2 + 25n + 4 operations to finish, we say that it is O(n^2), but it is also O(n^3) and O(n^100). However, it is Θ(n^2) and it is not Θ(n^3), Θ(n^4) etc. An algorithm that is Θ(f(n)) is also O(f(n)), but not vice versa!

Formal mathematical definition

Θ(g(x)) is a set of functions.

    Θ(g(x)) = {f(x) such that there exist positive constants c1, c2, N such that 0 <= c1*g(x) <= f(x) <= c2*g(x) for all x > N}

Because Θ(g(x)) is a set, we could write f(x) ∈ Θ(g(x)) to indicate that f(x) is a member of Θ(g(x)). Instead, we will usually write f(x) = Θ(g(x)) to express the same notion; that's the common way.

Whenever Θ(g(x)) appears in a formula, we interpret it as standing for some anonymous function that we do not care to name.
For example, the equation T(n) = T(n/2) + Θ(n) means T(n) = T(n/2) + f(n), where f(n) is a function in the set Θ(n).

Let f and g be two functions defined on some subset of the real numbers. We write f(x) = Θ(g(x)) as x->infinity if and only if there are positive constants K and L and a real number x0 such that the following holds:

    K|g(x)| <= f(x) <= L|g(x)| for all x >= x0.

The definition is equivalent to:

    f(x) = O(g(x)) and f(x) = Ω(g(x))

A method that uses limits

If limit(x->infinity) f(x)/g(x) = c ∈ (0,∞), i.e. the limit exists and is positive, then f(x) = Θ(g(x))

Common Complexity Classes

    Name          Notation       n = 10       n = 100
    Constant      Θ(1)           1            1
    Logarithmic   Θ(log(n))      3            7
    Linear        Θ(n)           10           100
    Linearithmic  Θ(n*log(n))    30           700
    Quadratic     Θ(n^2)         100          10 000
    Exponential   Θ(2^n)         1 024        1.267650e+30
    Factorial     Θ(n!)          3 628 800    9.332622e+157

Section 2.2: Comparison of the asymptotic notations

Let f(n) and g(n) be two functions defined on the set of the positive real numbers; c, c1, c2, n0 are positive real constants. For each notation, the entries below give the formal definition, the analogy between the asymptotic comparison of f, g and the comparison of real numbers a, b, and an example.

Notation: f(n) = O(g(n))
    Formal definition: ∃ c > 0, ∃ n0 > 0 : ∀ n ≥ n0, 0 ≤ f(n) ≤ c g(n)
    Analogy: a ≤ b
    Example: 7n + 10 = O(n^2 + n - 9)

Notation: f(n) = Ω(g(n))
    Formal definition: ∃ c > 0, ∃ n0 > 0 : ∀ n ≥ n0, 0 ≤ c g(n) ≤ f(n)
    Analogy: a ≥ b
    Example: n^3 - 34 = Ω(10n^2 - 7n + 1)

Notation: f(n) = Θ(g(n))
    Formal definition: ∃ c1, c2 > 0, ∃ n0 > 0 : ∀ n ≥ n0, 0 ≤ c1 g(n) ≤ f(n) ≤ c2 g(n)
    Analogy: a = b
    Example: 1/2 n^2 - 7n = Θ(n^2)

Notation: f(n) = o(g(n))
    Formal definition: ∀ c > 0, ∃ n0 > 0 : ∀ n ≥ n0, 0 ≤ f(n) < c g(n)
    Analogy: a < b
    Example: 5n^2 = o(n^3)

Notation: f(n) = ω(g(n))
    Formal definition: ∀ c > 0, ∃ n0 > 0 : ∀ n ≥ n0, 0 ≤ c g(n) < f(n)
    Analogy: a > b
    Example: 7n^2 = ω(n)

Graphic interpretation

The asymptotic notations can be represented on a Venn diagram as follows:

[Figure: Venn diagram of the asymptotic notations]

Links

Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein. Introduction to Algorithms.

Section 2.3: Big-Omega Notation

Ω-notation is used for asymptotic lower bound.
Formal definition

Let f(n) and g(n) be two functions defined on the set of the positive real numbers. We write f(n) = Ω(g(n)) if there are positive constants c and n0 such that:

    0 ≤ c g(n) ≤ f(n) for all n ≥ n0.

Notes

f(n) = Ω(g(n)) means that f(n) grows asymptotically no slower than g(n). We can also state an Ω(g(n)) bound when the analysis of an algorithm is not enough to make a statement about Θ(g(n)) and/or O(g(n)).

From the definitions of the notations follows the theorem:

For any two functions f(n) and g(n) we have f(n) = Θ(g(n)) if and only if f(n) = O(g(n)) and f(n) = Ω(g(n)).

Graphically Ω-notation may be represented as follows:

[Figure: graphical representation of Ω-notation]

For example, let f(n) = 3n^2 + 5n - 4. Then f(n) = Ω(n^2). It is also correct to write f(n) = Ω(n), or even f(n) = Ω(1).

Another example is an algorithm for the perfect matching problem: if the number of vertices is odd then output "No Perfect Matching", otherwise try all possible matchings.

We would like to say the algorithm requires exponential time, but in fact you cannot prove a Ω(n^2) lower bound using the usual definition of Ω since the algorithm runs in linear time for n odd. We should instead define f(n)=Ω(g(n)) by saying for some constant c>0, f(n)≥ c g(n) for infinitely many n. This gives a nice correspondence between upper and lower bounds: f(n)=Ω(g(n)) iff f(n) != o(g(n)).

References

Formal definition and theorem are taken from the book "Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein. Introduction to Algorithms".

Chapter 3: Big-O Notation

Definition

The Big-O notation is at its heart a mathematical notation, used to compare the rate of convergence of functions. Let n -> f(n) and n -> g(n) be functions defined over the natural numbers. Then we say that f = O(g) if and only if f(n)/g(n) is bounded when n approaches infinity. In other words, f = O(g) if and only if there exists a constant A, such that for all n, f(n)/g(n) <= A.
Actually the scope of the Big-O notation is a bit wider in mathematics, but for simplicity I have narrowed it to what is used in algorithm complexity analysis: functions defined on the naturals, that have non-zero values, and the case of n growing to infinity.

What does it mean?

Let's take the case of f(n) = 100n^2 + 10n + 1 and g(n) = n^2. It is quite clear that both of these functions tend to infinity as n tends to infinity. But sometimes knowing the limit is not enough, and we also want to know the speed at which the functions approach their limit. Notions like Big-O help compare and classify functions by their speed of convergence.

Let's find out if f = O(g) by applying the definition. We have f(n)/g(n) = 100 + 10/n + 1/n^2. Since 10/n is 10 when n is 1 and is decreasing, and since 1/n^2 is 1 when n is 1 and is also decreasing, we have f(n)/g(n) <= 100 + 10 + 1 = 111. The definition is satisfied because we have found a bound of f(n)/g(n) (111) and so f = O(g) (we say that f is a Big-O of n^2).

This means that f tends to infinity at approximately the same speed as g. Now this may seem like a strange thing to say, because what we have found is that f is at most 111 times bigger than g, or in other words when g grows by 1, f grows by at most 111. It may seem that growing 111 times faster is not "approximately the same speed". And indeed the Big-O notation is not a very precise way to classify function convergence speed, which is why in mathematics we use the equivalence relationship when we want a precise estimation of speed. But for the purposes of separating algorithms in large speed classes, Big-O is enough. We don't need to separate functions that grow a fixed number of times faster than each other, but only functions that grow infinitely faster than each other. For instance if we take h(n) = n^2*log(n), we see that h(n)/g(n) = log(n) which tends to infinity with n so h is not O(n^2), because h grows infinitely faster than n^2.
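The bound of 111 and the unbounded ratio for h can be checked numerically. The following is only a small sketch: f, g and h are the functions from the text, while the list of sample sizes is an arbitrary choice made for illustration.

```python
import math

def f(n):
    return 100 * n**2 + 10 * n + 1

def g(n):
    return n**2

def h(n):
    return n**2 * math.log(n)

# Sample sizes chosen only for illustration.
sizes = [1, 10, 100, 1000, 10**6]

# f(n)/g(n) starts at 111 for n = 1 and decreases towards 100.
f_over_g = [f(n) / g(n) for n in sizes]

# h(n)/g(n) equals log(n), so it keeps growing (n = 1 is skipped since log(1) = 0).
h_over_g = [h(n) / g(n) for n in sizes[1:]]

for n, ratio in zip(sizes, f_over_g):
    print(n, ratio)
```

The first list never exceeds 111, which is the constant A from the definition; no such constant exists for the second list.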
Now I need to make a side note: you might have noticed that if f = O(g) and g = O(h), then f = O(h). For instance in our case, we have f = O(n^3), and f = O(n^4)... In algorithm complexity analysis, we frequently say f = O(g) to mean that f = O(g) and g = O(f), which can be understood as "g is the smallest Big-O for f". In mathematics we say that such functions are Big-Thetas of each other.

How is it used?

When comparing algorithm performance, we are interested in the number of operations that an algorithm performs. This is called time complexity. In this model, we consider that each basic operation (addition, multiplication, comparison, assignment, etc.) takes a fixed amount of time, and we count the number of such operations. We can usually express this number as a function of the size of the input, which we call n. And sadly, this number usually grows to infinity with n (if it doesn't, we say that the algorithm is O(1)). We separate our algorithms in big speed classes defined by Big-O: when we speak about a "O(n^2) algorithm", we mean that the number of operations it performs, expressed as a function of n, is O(n^2). This says that our algorithm is approximately as fast as an algorithm that would do a number of operations equal to the square of the size of its input, or faster. The "or faster" part is there because I used Big-O instead of Big-Theta, but usually people will say Big-O to mean Big-Theta.

When counting operations, we usually consider the worst case: for instance if we have a loop that can run at most n times and that contains 5 operations, the number of operations we count is 5n. It is also possible to consider the average case complexity.

Quick note: a fast algorithm is one that performs few operations, so if the number of operations grows to infinity faster, then the algorithm is slower: O(n) is better than O(n^2).

We are also sometimes interested in the space complexity of our algorithm.
For this we consider the number of bytes in memory occupied by the algorithm as a function of the size of the input, and use Big-O the same way.

Section 3.1: A Simple Loop

The following function finds the maximal element in an array:

    #include <limits.h>  /* INT_MIN */
    #include <stddef.h>  /* size_t */

    int find_max(const int *array, size_t len) {
        int max = INT_MIN;
        for (size_t i = 0; i < len; i++) {
            if (max < array[i]) {
                max = array[i];
            }
        }
        return max;
    }

The input size is the size of the array, which I called len in the code.

Let's count the operations.

    int max = INT_MIN;
    size_t i = 0;

These two assignments are done only once, so that's 2 operations. The operations that are looped are:

    if (max < array[i])
    i++;
    max = array[i]

Since there are 3 operations in the loop, and the loop is done n times, we add 3n to our already existing 2 operations to get 3n + 2. So our function takes 3n + 2 operations to find the max (its complexity is 3n + 2). This is a polynomial where the fastest growing term is a factor of n, so it is O(n).

You probably have noticed that "operation" is not very well defined. For instance I said that if (max < array[i]) was one operation, but depending on the architecture this statement can compile to, for instance, three instructions: one memory read, one comparison and one branch. I have also considered all operations as the same, even though for instance the memory operations will be slower than the others, and their performance will vary wildly due for instance to cache effects. I also have completely ignored the return statement, the fact that a frame will be created for the function, etc. In the end it doesn't matter to complexity analysis, because whatever way I choose to count operations, it will only change the coefficient of the n factor and the constant, so the result will still be O(n). Complexity shows how the algorithm scales with the size of the input, but it isn't the only aspect of performance!
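The 3n + 2 count above can be reproduced with an instrumented version of the loop. This is a minimal sketch, assuming the same counting convention as the text (two initial assignments, then a comparison, an increment and possibly an assignment per iteration); the name find_max_with_count is hypothetical and not part of the original code.

```python
def find_max_with_count(array):
    """Return (maximum, number of counted operations)."""
    ops = 2  # the two initial assignments: max and the loop index
    current_max = float("-inf")
    for value in array:
        ops += 2  # one comparison plus one index increment
        if current_max < value:
            current_max = value
            ops += 1  # the assignment to max
    return current_max, ops

# Worst case: every element is larger than the previous one,
# so the assignment happens on each of the n iterations.
print(find_max_with_count([1, 2, 3, 4]))  # (4, 14), and 3 * 4 + 2 = 14
```

For other orderings the count is smaller, which is why 3n + 2 is the worst-case figure.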
Section 3.2: A Nested Loop

The following function checks if an array has any duplicates by taking each element, then iterating over the whole array to see if the element is there:

    #include <stddef.h>  /* size_t */

    _Bool contains_duplicates(const int *array, size_t len) {
        /* i + 1 < len avoids the unsigned wrap-around of len - 1 when len is 0 */
        for (size_t i = 0; i + 1 < len; i++) {
            for (size_t j = 0; j < len; j++) {
                if (i != j && array[i] == array[j]) {
                    return 1;
                }
            }
        }
        return 0;
    }

The inner loop performs at each iteration a number of operations that is constant with n. The outer loop also does a few constant operations, and runs the inner loop n times. The outer loop itself is run about n times. So the operations inside the inner loop are run about n^2 times, the operations in the outer loop are run about n times, and the assignment to i is done one time. Thus, the complexity will be something like an^2 + bn + c, and since the highest term is n^2, the O notation is O(n^2).

As you may have noticed, we can improve the algorithm by avoiding doing the same comparisons multiple times. We can start from i + 1 in the inner loop, because all elements before it will already have been checked against all array elements, including the one at index i + 1. This allows us to drop the i == j check.

    _Bool faster_contains_duplicates(const int *array, size_t len) {
        for (size_t i = 0; i + 1 < len; i++) {
            for (size_t j = i + 1; j < len; j++) {
                if (array[i] == array[j]) {
                    return 1;
                }
            }
        }
        return 0;
    }

Obviously, this second version does fewer operations and so is more efficient. How does that translate to Big-O notation? Well, now the inner loop body is run 1 + 2 + ... + n - 1 = n(n-1)/2 times. This is still a polynomial of the second degree, and so is still only O(n^2). We have clearly lowered the cost, since we roughly divided by 2 the number of operations that we are doing, but we are still in the same complexity class as defined by Big-O.
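The factor of two can be observed by counting inner-loop iterations directly. The sketch below only mirrors the loop bounds of the two functions on an array without duplicates (so neither returns early); the helper name count_inner_iterations is hypothetical.

```python
def count_inner_iterations(n):
    """Count inner-loop iterations of both versions for an array of n distinct values."""
    slow = 0
    for i in range(n - 1):         # outer loop of the first version
        for j in range(n):         # inner loop scans the whole array
            slow += 1

    fast = 0
    for i in range(n - 1):         # outer loop of the second version
        for j in range(i + 1, n):  # inner loop starts after index i
            fast += 1

    return slow, fast

for n in (10, 100, 1000):
    slow, fast = count_inner_iterations(n)
    print(n, slow, fast, slow / fast)
```

The ratio stays at 2 however large n becomes, which is why both versions remain in the same Big-O class.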
In order to lower the complexity to a lower class we would need to divide the number of operations by something that tends to infinity with n.

Section 3.3: O(log n) types of Algorithms

Let's say we have a problem of size n. Now for each step of our algorithm (which we need to write), our original problem becomes half of its previous size (n/2).

So at each step, our problem becomes half.

    Step    Problem
    1       n/2
    2       n/4
    3       n/8
    4       n/16

When the problem space is fully reduced (i.e. solved completely), it cannot be reduced any further (n becomes equal to 1) after exiting the check condition.

1. Let's say at the kth step or number of operations: problem-size = 1
2. But we know at the kth step, our problem-size should be: problem-size = n/2^k
3. From 1 and 2: n/2^k = 1 or n = 2^k
4. Take log on both sides: log_e n = k log_e 2 or k = log_e n / log_e 2
5. Using the formula log_x m / log_x n = log_n m: k = log_2 n or simply k = log n

Now we know that our algorithm can run at most log n steps, hence the time complexity comes out as O(log n).

A very simple example in code to support the above text is:

    for(int i=1; i<=n; i=i*2)
    {
        // perform some operation
    }

So now if someone asks you how many steps that loop (or any other algorithm that cuts its problem size in half) will run if n is 256, you can very easily calculate:

    k = log_2 256
    k = log_2 2^8    ( => log_a a = 1)
    k = 8

(The loop above also executes for the starting value i = 1, so it runs log_2 n + 1 = 9 times, which is still O(log n).)

Another very good example for a similar case is the Binary Search Algorithm.

    int bSearch(int arr[], int size, int item) {
        int low = 0;
        int high = size - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2;
            if (arr[mid] == item)
                return mid;
            else if (arr[mid] < item)
                low = mid + 1;   // item can only be in the upper half
            else
                high = mid - 1;  // item can only be in the lower half
        }
        return -1;  // item not found
    }

Section 3.4: An O(log n) example

[...]

    [...] L[h] > 0:
        b = h
    elif L[h] < 0:
        a = h

a and b are the indexes between which 0 is to be found. Each time we enter the loop, we use an index between a and b and use it to narrow the area to be searched.

In the worst case, we have to wait until a and b are equal. But how many operations does that take?
Not n, because each time we enter the loop, we divide the distance between a and b by about two. Rather, the complexity is O(log n).

Explanation

Note: When we write "log", we mean the binary logarithm, or log base 2 (which we will write "log_2"). As O(log_2 n) = O(log n) (you can do the math) we will use "log" instead of "log_2".

Let's call x the number of operations: we know that 1 = n / (2^x).

So 2^x = n, then x = log n

Conclusion

When faced with successive divisions (be it by two or by any number), remember that the complexity is logarithmic.

Chapter 4: Trees

Section 4.1: Typical n-ary tree representation

Typically we represent an n-ary tree (one with potentially unlimited children per node) as a binary tree (one with exactly two pointers per node): the first points to the node's first child and the second, "next", points to the node's next sibling. Note that if a tree is binary, this representation creates extra nodes.

We then iterate over the siblings and recurse down the children. As most trees are relatively shallow (lots of children but only a few levels of hierarchy), this gives rise to efficient code. Note that human genealogies are an exception (lots of levels of ancestors, only a few children per level).

If necessary, back pointers can be kept to allow the tree to be ascended. These are more difficult to maintain.

Note that it is typical to have one function to call on the root and a recursive function with extra parameters, in this case tree depth.

    #include <cstdio>
    #include <string>

    struct node
    {
        node *next;        // next sibling
        node *child;       // first child
        std::string data;
    };

    void printtree_r(node *n, int depth)
    {
        while (n)
        {
            // indent the node according to its depth, then print its data
            for (int i = 0; i < depth * 3; i++)
                printf(" ");
            printf("%s\n", n->data.c_str());

            // recurse into the children, one level deeper
            if (n->child)
                printtree_r(n->child, depth + 1);

            // move on to the next sibling
            n = n->next;
        }
    }

    void printtree(node *root)
    {
        printtree_r(root, 0);
    }

Section 4.2: Introduction

Trees are a sub-type of the more general node-edge graph data structure.
To be a tree, a graph must satisfy two requirements:

It is acyclic. It contains no cycles (or "loops").
It is connected. For any given node in the graph, every node is reachable. All nodes are reachable through one path in the graph.

The tree data structure is quite common within computer science. Trees are used to model many different algorithmic data structures, such as ordinary binary trees, red-black trees, B-trees, AB-trees, 23-trees, heaps, and tries.

It is common to refer to a Tree as a Rooted Tree by:

choosing 1 cell to be called `Root`
painting the `Root` at the top
creating a lower layer for each cell in the graph depending on its distance from the root: the bigger the distance, the lower the cell (example above)

The common symbol for trees is T.

Section 4.3: To check if two Binary trees are same or not

Example 1

If the inputs are:

a) [Figure: first binary tree]
b) [Figure: second binary tree]

Output should be true.

Example 2

If the inputs are:

a) [Figure: first binary tree]
b) [Figure: second binary tree]

Output should be false.

Pseudo code for the same:

    boolean sameTree(node root1, node root2){
        // two empty trees are the same
        if(root1 == NULL && root2 == NULL)
            return true;
        // exactly one of the trees is empty
        if(root1 == NULL || root2 == NULL)
            return false;
        // same data in the roots and the same left and right subtrees
        if(root1->data == root2->data
            && sameTree(root1->left, root2->left)
            && sameTree(root1->right, root2->right))
            return true;
        return false;
    }

Chapter 5: Binary Search Trees

A binary tree is a tree in which each node has a maximum of two children. A binary search tree (BST) is a binary tree whose elements are positioned in a special order: for every node, all values (i.e. keys) in its left subtree are less than the values in its right subtree.

Section 5.1: Binary Search Tree - Insertion (Python)

This is a simple implementation of Binary Search Tree Insertion using Python.
An example is shown below:

[Figure: example binary search tree]

Following the code snippet, each image shows the execution visualization, which makes it easier to see how this code works.

    class Node:
        def __init__(self, val):
            self.l_child = None
            self.r_child = None
            self.data = val

    def insert(root, node):
        if root is None:
            # Rebinding the local name does not change the caller's variable,
            # so the caller has to create the root node itself.
            root = node
        else:
            if root.data > node.data:
                if root.l_child is None:
                    root.l_child = node
                else:
                    insert(root.l_child, node)
            else:
                if root.r_child is None:
                    root.r_child = node
                else:
                    insert(root.r_child, node)

    def in_order_print(root):
        if not root:
            return
        in_order_print(root.l_child)
        print(root.data)
        in_order_print(root.r_child)

    def pre_order_print(root):
        if not root:
            return
        print(root.data)
        pre_order_print(root.l_child)
        pre_order_print(root.r_child)

Section 5.2: Binary Search Tree - Deletion(C++)

Before starting with deletion, I just want to shed some light on what a binary search tree (BST) is. Each node in a BST can have a maximum of two child nodes (left and right child). The keys in the left sub-tree of a node are less than or equal to the node's key. The keys in the right sub-tree of a node are greater than the node's key.

Deleting a node in a tree while maintaining its binary search tree property.

There are three cases to be considered while deleting a node.

Case 1: Node to be deleted is a leaf node. (Node with value 22).
Case 2: Node to be deleted has one child. (Node with value 26).
Case 3: Node to be deleted has both children. (Node with value 49).

Explanation of cases:

1. When the node to be deleted is a leaf node, simply delete the node and pass nullptr to its parent node.
2. When the node to be deleted has only one child, link that child to the node's parent in place of the node and delete the node.
3.
When the node to be deleted has two children, the minimum of its right sub-tree can be copied to the node, and then that minimum value can be deleted from the node's right sub-tree (this reduces to Case 1 or Case 2).

Note: The minimum in the right sub-tree can have at most one child, and that child is a right child. If it had a left child, it would not be the minimum value, or the tree would not follow the BST property.

The structure of a node in a tree and the code for deletion:

    #include <cstdlib>  // free; nodes are assumed to be allocated with malloc

    struct node
    {
        int data;
        node *left, *right;
    };

    node* delete_node(node *root, int data)
    {
        if(root == nullptr) return root;
        else if(data < root->data) root->left = delete_node(root->left, data);
        else if(data > root->data) root->right = delete_node(root->right, data);
        else
        {
            if(root->left == nullptr && root->right == nullptr) // Case 1
            {
                free(root);
                root = nullptr;
            }
            else if(root->left == nullptr) // Case 2
            {
                node* temp = root;
                root = root->right;
                free(temp);
            }
            else if(root->right == nullptr) // Case 2
            {
                node* temp = root;
                root = root->left;
                free(temp);
            }
            else // Case 3
            {
                // find the minimum of the right subtree
                node* temp = root->right;
                while(temp->left != nullptr) temp = temp->left;

                // copy it into this node, then delete it from the right subtree
                root->data = temp->data;
                root->right = delete_node(root->right, temp->data);
            }
        }
        return root;
    }

Time complexity of above code is O(h), where h is the height of the tree.
Section 5.3: Lowest common ancestor in a BST

Consider the BST:

[Figure: example binary search tree]

Lowest common ancestor of 22 and 26 is 24
Lowest common ancestor of 26 and 49 is 46
Lowest common ancestor of 22 and 24 is 24

The binary search tree property can be used to find the lowest common ancestor of two nodes.

Pseudo code:

    lowestCommonAncestor(root, node1, node2){
        if(root == NULL)
            return NULL;
        // one of the nodes is the root itself
        else if(node1->data == root->data || node2->data == root->data)
            return root;
        // the nodes lie on different sides of the root
        else if((node1->data <= root->data && node2->data > root->data)
                || (node2->data <= root->data && node1->data > root->data)){
            return root;
        }
        // both nodes are in the left subtree
        else if(root->data > max(node1->data, node2->data)){
            return lowestCommonAncestor(root->left, node1, node2);
        }
        // both nodes are in the right subtree
        else {
            return lowestCommonAncestor(root->right, node1, node2);
        }
    }

Section 5.4: Binary Search Tree - Python

    # Python 2 syntax (print statement)
    class Node(object):
        def __init__(self, val):
            self.l_child = None
            self.r_child = None
            self.val = val

    class BinarySearchTree(object):
        def insert(self, root, node):
            if root is None:
                return node
            if root.val < node.val:
                root.r_child = self.insert(root.r_child, node)
            else:
                root.l_child = self.insert(root.l_child, node)
            return root

        def in_order_place(self, root):
            if not root:
                return None
            else:
                self.in_order_place(root.l_child)
                print root.val
                self.in_order_place(root.r_child)

        def pre_order_place(self, root):
            if not root:
                return None
            else:
                print root.val
                self.pre_order_place(root.l_child)
                self.pre_order_place(root.r_child)

        def post_order_place(self, root):
            if not root:
                return None
            else:
                self.post_order_place(root.l_child)
                self.post_order_place(root.r_child)
                print root.val

    """ Create different node and insert data into it"""
    r = Node(3)
    node = BinarySearchTree()
    nodeList = [1, 8, 5, 12, 14, 6, 15, 7, 16, 8]

    for nd in nodeList:
        node.insert(r, Node(nd))

    # The traversal methods print the values themselves and return None,
    # so they are called directly rather than passed to print.
    print "------In order ---------"
    node.in_order_place(r)
    print "------Pre order ---------"
    node.pre_order_place(r)
    print "------Post order
---------"
    node.post_order_place(r)

Chapter 6: Check if a tree is BST or not

Section 6.1: Algorithm to check if a given binary tree is BST

A binary tree is a BST if it satisfies any one of the following conditions:

1. It is empty
2. It has no subtrees
3. For every node x in the tree, all the keys (if any) in the left sub tree must be less than key(x) and all the keys (if any) in the right sub tree must be greater than key(x).

So a straightforward recursive algorithm would be:

    is_BST(root):
        if root == NULL:
            return true

        // Check values in left subtree
        if root->left != NULL:
            max_key_in_left = find_max_key(root->left)
            if max_key_in_left > root->key:
                return false

        // Check values in right subtree
        if root->right != NULL:
            min_key_in_right = find_min_key(root->right)
            if min_key_in_right < root->key:
                return false

        return is_BST(root->left) && is_BST(root->right)

The above recursive algorithm is correct but inefficient, because it traverses each node multiple times.

Another approach, which minimizes the multiple visits of each node, is to remember the min and max possible values of the keys in the subtree we are visiting. Let the minimum possible value of any key be K_MIN and the maximum value be K_MAX. When we start from the root of the tree, the range of values in the tree is [K_MIN,K_MAX]. Let the key of the root node be x. Then the range of values in the left subtree is [K_MIN,x) and the range of values in the right subtree is (x,K_MAX]. We will use this idea to develop a more efficient algorithm (the key-1 and key+1 bounds below assume integer keys).

    is_BST(root, min, max):
        if root == NULL:
            return true

        // is the current node key out of range?
        if root->key < min || root->key > max:
            return false

        // check if left and right subtree is BST
        return is_BST(root->left, min, root->key-1) && is_BST(root->right, root->key+1, max)

It will be initially called as:

    is_BST(my_tree_root, K_MIN, K_MAX)

Another approach will be to do inorder traversal of the Binary tree.
If the inorder traversal produces a sorted sequence of keys then the given tree is a BST. To check if the inorder sequence is sorted remember the value of previously visited node and compare it against the current node.

Section 6.2: If a given input tree follows Binary search tree property or not

For example, if the input is:

[Figure: binary tree with root value 3 and a value 4 in its left sub-tree]

Output should be false, as 4 in the left sub-tree is greater than the root value (3).

If the input is:

[Figure: binary tree that satisfies the BST property]

Output should be true.

Chapter 7: Binary Tree traversals

Visiting the nodes of a binary tree in some particular order is called a traversal.

Section 7.1: Level Order traversal - Implementation

For example, if the given tree is:

[Figure: binary tree with nodes 1 to 7]

Level order traversal will be

    1 2 3 4 5 6 7

Printing node data level by level.

Code:

    #include <iostream>
    #include <queue>
    #include <cstdlib>  // malloc

    using namespace std;

    struct node{
        int data;
        node *left;
        node *right;
    };

    void levelOrder(struct node *root){
        if(root == NULL) return;

        queue<node*> Q;
        Q.push(root);

        while(!Q.empty()){
            // visit the node at the front, then enqueue its children
            struct node* curr = Q.front();
            cout << curr->data << " ";
            if(curr->left != NULL) Q.push(curr->left);
            if(curr->right != NULL) Q.push(curr->right);

            Q.pop();
        }
    }

    struct node* newNode(int data)
    {
        struct node* node = (struct node*) malloc(sizeof(struct node));
        node->data = data;
        node->left = NULL;
        node->right = NULL;

        return(node);
    }

    int main(){
        struct node *root = newNode(1);
        root->left = newNode(2);
        root->right = newNode(3);
        root->left->left = newNode(4);
        root->left->right = newNode(5);
        root->right->left = newNode(6);
        root->right->right = newNode(7);

        cout << "Level Order traversal of binary tree is \n";
        levelOrder(root);

        return 0;
    }

Queue data structure is used to achieve the above objective.
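The same queue-based idea can be sketched in Python with collections.deque from the standard library. The Node class and the level_order name are assumptions made for this illustration; the tree built at the end is the seven-node tree from the C++ main function.

```python
from collections import deque

class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right

def level_order(root):
    """Return the node values level by level, left to right."""
    result = []
    if root is None:
        return result
    queue = deque([root])
    while queue:
        current = queue.popleft()        # oldest node in the queue
        result.append(current.data)
        if current.left is not None:     # children join the back of the queue
            queue.append(current.left)
        if current.right is not None:
            queue.append(current.right)
    return result

root = Node(1, Node(2, Node(4), Node(5)), Node(3, Node(6), Node(7)))
print(level_order(root))  # [1, 2, 3, 4, 5, 6, 7]
```

Because the queue is first-in first-out, all nodes of one level leave it before any node of the next level, which gives the level-by-level order.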
Section 7.2: Pre-order, Inorder and Post Order traversal of a Binary Tree

Consider the Binary Tree:

[Figure: binary tree with nodes 1 to 7]

Pre-order traversal(root) is traversing the node, then the left sub-tree of the node, and then the right sub-tree of the node.

So the pre-order traversal of above tree will be:

    1 2 4 5 3 6 7

In-order traversal(root) is traversing the left sub-tree of the node, then the node, and then the right sub-tree of the node.

So the in-order traversal of above tree will be:

    4 2 5 1 6 3 7

Post-order traversal(root) is traversing the left sub-tree of the node, then the right sub-tree, and then the node.

So the post-order traversal of above tree will be:

    4 5 2 6 7 3 1

Chapter 8: Lowest common ancestor of a Binary Tree

Lowest common ancestor between two nodes n1 and n2 is defined as the lowest node in the tree that has both n1 and n2 as descendants.

Section 8.1: Finding lowest common ancestor

Consider the tree:

[Figure: example binary tree]

Lowest common ancestor of nodes with value 1 and 4 is 2
Lowest common ancestor of nodes with value 1 and 5 is 3
Lowest common ancestor of nodes with value 2 and 4 is 4
Lowest common ancestor of nodes with value 1 and 2 is 2

Chapter 9: Graph

A graph is a collection of points and lines connecting some (possibly empty) subset of them.

The points of a graph are called graph vertices, "nodes" or simply "points." Similarly, the lines connecting the vertices of a graph are called graph edges, "arcs" or "lines."

A graph G can be defined as a pair (V,E), where V is a set of vertices, and E is a set of edges between the vertices E ⊆ {(u,v) | u, v ∈ V}.

Section 9.1: Storing Graphs (Adjacency Matrix)

To store a graph, two methods are common:

Adjacency Matrix
Adjacency List

An adjacency matrix is a square matrix used to represent a finite graph. The elements of the matrix indicate whether pairs of vertices are adjacent or not in the graph.
Adjacent means 'next to or adjoining something else' or to be beside something. For example, your neighbors are adjacent to you. In graph theory, if we can go to node B from node A, we can say that node B is adjacent to node A. Now we will learn how to store which nodes are adjacent to which via an Adjacency Matrix. This means we will represent which nodes share an edge between them. Here matrix means 2D array.

Here you can see a table beside the graph; this is our adjacency matrix. Here Matrix[i][j] = 1 represents that there is an edge between i and j. If there's no edge, we simply put Matrix[i][j] = 0.

These edges can be weighted; for example, a weight can represent the distance between two cities. Then we'll put the value in Matrix[i][j] instead of putting 1.

The graph described above is Bidirectional or Undirected, which means that if we can go to node 1 from node 2, we can also go to node 2 from node 1. If the graph was Directed, then there would've been an arrow sign on one side of each edge. Even then, we could represent it using an adjacency matrix.

We represent the nodes that don't share an edge by infinity. One thing to notice is that, if the graph is undirected, the matrix becomes symmetric.

The pseudo-code to create the matrix:

    Procedure AdjacencyMatrix(N):    //N represents the number of nodes
    Matrix[N][N]
    for i from 1 to N
        for j from 1 to N
            Take input -> Matrix[i][j]
        endfor
    endfor

We can also populate the Matrix using this common way:

    Procedure AdjacencyMatrix(N, E):    // N -> number of nodes
    Matrix[N][N]                        // E -> number of edges
    for i from 1 to E
        input -> n1, n2, cost
        Matrix[n1][n2] = cost
        Matrix[n2][n1] = cost
    endfor

For directed graphs, we can remove the Matrix[n2][n1] = cost line.

The drawbacks of using Adjacency Matrix:

Memory is a huge problem. No matter how many edges are there, we will always need N * N sized matrix where N is the number of nodes.
If there are 10000 nodes, the matrix size will be 4 * 10000 * 10000 bytes (assuming 4-byte integers), around 381 megabytes. This is a huge waste of memory if we consider graphs that have only a few edges.

Suppose we want to find out which nodes we can go to from a node u. We'll need to check the whole row of u, which costs a lot of time.

The only benefit is that we can easily find the connection between u-v nodes, and their cost, using the Adjacency Matrix.

Java code implemented using above pseudo-code:

import java.util.Scanner;

public class Represent_Graph_Adjacency_Matrix {
    private final int vertices;
    private int[][] adjacency_matrix;

    public Represent_Graph_Adjacency_Matrix(int v) {
        vertices = v;
        // Vertices are numbered from 1, so row and column 0 stay unused.
        adjacency_matrix = new int[vertices + 1][vertices + 1];
    }

    // Stores a directed edge from -> to with the given value.
    public void makeEdge(int from, int to, int edge) {
        try {
            adjacency_matrix[from][to] = edge;
        } catch (ArrayIndexOutOfBoundsException index) {
            System.out.println("The vertex does not exist");
        }
    }

    // Returns the stored value for from -> to, or -1 for an invalid vertex.
    public int getEdge(int from, int to) {
        try {
            return adjacency_matrix[from][to];
        } catch (ArrayIndexOutOfBoundsException index) {
            System.out.println("The vertex does not exist");
        }
        return -1;
    }

    public static void main(String args[]) {
        int v, e, count = 1, from = 0, to = 0;
        Scanner sc = new Scanner(System.in);
        Represent_Graph_Adjacency_Matrix graph;
        try {
            System.out.println("Enter the number of vertices: ");
            v = sc.nextInt();
            System.out.println("Enter the number of edges: ");
            e = sc.nextInt();
            graph = new Represent_Graph_Adjacency_Matrix(v);

            System.out.println("Enter the edges: ");
            while (count <= e) {
                from = sc.nextInt();
                to = sc.nextInt();
                graph.makeEdge(from, to, 1);
                count++;
            }

            System.out.println("The adjacency matrix for the given graph is: ");
            System.out.print("  ");
            for (int i = 1; i <= v; i++)
                System.out.print(i + " ");
            System.out.println();

            for (int i = 1; i <= v; i++) {
                System.out.print(i + " ");
                for (int j = 1; j <= v; j++)
                    System.out.print(graph.getEdge(i, j) + " ");
                System.out.println();
            }
        } catch (Exception ex) {
            System.out.println("Something went wrong");
        }
        sc.close();
    }
}

Running the code: Save the file and compile using javac Represent_Graph_Adjacency_Matrix.java

Example:

$ java Represent_Graph_Adjacency_Matrix
Enter the number of vertices:
4
Enter the number of edges:
6
Enter the edges:
1 1
3 4
2 3
1 4
2 4
1 2
The adjacency matrix for the given graph is:
  1 2 3 4
1 1 1 0 1
2 0 0 1 1
3 0 0 0 1
4 0 0 0 0

Section 9.2: Introduction To Graph Theory

Graph Theory is the study of graphs, which are mathematical structures used to model pairwise relations between objects.

Did you know that a great many real-world problems can be converted into problems of Roads and Cities, and solved? Graph Theory was invented many years ago, even before the invention of the computer. Leonhard Euler wrote a paper on the Seven Bridges of Königsberg which is regarded as the first paper of Graph Theory. Since then, people have come to realize that if we can convert a problem to this City-Road problem, we can often solve it with Graph Theory.

Graph Theory has many applications. One of the most common applications is finding the shortest distance from one city to another. To reach your PC, this web-page had to travel through many routers from the server. Graph Theory helps to find out which routers need to be crossed. During war, which street needs to be bombarded to disconnect the capital city from the others can also be found out using Graph Theory.

Let us first learn some basic definitions on Graph Theory.

Graph:
Let's say we have 6 cities. We mark them as 1, 2, 3, 4, 5, 6. Now we connect the cities that have roads between each other.

This is a simple graph where some cities are shown with the roads that are connecting them. In Graph Theory, we call each of these cities a Node or Vertex, and the roads are called Edges.
A graph is simply a connection of these nodes and edges. A node can represent a lot of things. In some graphs nodes represent cities, in others they represent airports, or squares on a chessboard. An edge represents the relation between two nodes. That relation can be the time to go from one airport to another, the moves of a knight from one square to all the other squares, etc.

Path of Knight in a Chessboard

In simple words, a Node represents any object and an Edge represents the relation between two objects.

Adjacent Node:
If a node A shares an edge with node B, then B is considered to be adjacent to A. In other words, if two nodes are directly connected, they are called adjacent nodes. One node can have multiple adjacent nodes.

Directed and Undirected Graph:
In directed graphs, the edges have direction signs on one side, which means the edges are Unidirectional. On the other hand, the edges of undirected graphs have direction signs on both sides, which means they are Bidirectional. Usually undirected graphs are drawn with no signs on either side of the edges.

Let's assume there is a party going on. The people at the party are represented by nodes, and there is an edge between two people if they shake hands. Then this graph is undirected, because any person A shakes hands with person B if and only if B also shakes hands with A. In contrast, if an edge from a person A to another person B corresponds to A admiring B, then this graph is directed, because admiration is not necessarily reciprocated. The former type of graph is called an undirected graph and its edges are called undirected edges, while the latter type of graph is called a directed graph and its edges are called directed edges.

Weighted and Unweighted Graph:
A weighted graph is a graph in which a number (the weight) is assigned to each edge. Such weights might represent, for example, costs, lengths or capacities, depending on the problem at hand.
An unweighted graph is simply the opposite. We assume that the weights of all the edges are the same (presumably 1).

Path:
A path represents a way of going from one node to another. It consists of a sequence of edges. There can be multiple paths between two nodes.

In the example above, there are two paths from A to D. A->B, B->C, C->D is one path. The cost of this path is 3 + 4 + 2 = 9. Again, there's another path A->D. The cost of this path is 10. The path with the lowest cost is called the shortest path.

Degree:
The degree of a vertex is the number of edges that are connected to it. An edge that connects to the vertex at both ends (a loop) is counted twice.

In directed graphs, the nodes have two types of degrees:
In-degree: The number of edges that point to the node.
Out-degree: The number of edges that point from the node to other nodes.
For undirected graphs, they are simply called degree.

Some Algorithms Related to Graph Theory
Bellman–Ford algorithm
Dijkstra's algorithm
Ford–Fulkerson algorithm
Kruskal's algorithm
Nearest neighbour algorithm
Prim's algorithm
Depth-first search
Breadth-first search

Section 9.3: Storing Graphs (Adjacency List)

An adjacency list is a collection of unordered lists used to represent a finite graph. Each list describes the set of neighbors of a vertex in the graph. For graphs with few edges, it takes less memory than an adjacency matrix.

Let's see a graph, and its adjacency matrix:

Now we create a list using these values.

This is called an adjacency list. It shows which nodes are connected to which nodes. We can store this information using a 2D array, but that will cost us the same memory as the Adjacency Matrix. Instead we are going to use dynamically allocated memory to store it.

Many languages support a Vector or List type which we can use to store an adjacency list.
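As a rough illustration of the idea of one growable list per node, here is a minimal Python sketch. The function name build_adjacency_list, the directed flag and the edge list are assumptions made for this example; they are not the graph from the figure.

```python
from collections import defaultdict

def build_adjacency_list(edges, directed=False):
    """edges: iterable of (x, y) pairs.
    Returns a dict mapping each node to the list of its neighbours."""
    adjacency = defaultdict(list)
    for x, y in edges:
        adjacency[x].append(y)
        if not directed:
            adjacency[y].append(x)   # undirected: store the edge both ways
    return adjacency

# Hypothetical edge list, chosen only for illustration.
graph = build_adjacency_list([(1, 2), (1, 4), (2, 3), (3, 4)])
print(graph[1])        # prints: [2, 4]
print(len(graph[3]))   # prints: 2
```

Each list only grows as far as the node's number of neighbours, which is where the memory saving over an N * N matrix comes from.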
With a Vector or List, we don't need to specify the size of each list. We only need to specify the maximum number of nodes. The pseudo-code will be:

Procedure Adjacency-List(maxN, E):    // maxN denotes the maximum number of nodes
    edge[maxN] = Vector()             // E denotes the number of edges
    for i from 1 to E
        input -> x, y                 // Here x, y denotes there is an edge between x, y
        edge[x].push(y)
        edge[y].push(x)
    end for
    Return edge

Since this one is an undirected graph, if there is an edge from x to y, there is also an edge from y to x. If it were a directed graph, we'd omit the second push.

For weighted graphs, we need to store the cost too. We'll create another vector or list named cost[] to store these. The pseudo-code:

Procedure Adjacency-List(maxN, E):
    edge[maxN] = Vector()
    cost[maxN] = Vector()
    for i from 1 to E
        input -> x, y, w
        edge[x].push(y)
        cost[x].push(w)
    end for
    Return edge, cost

From this one, we can easily find out the total number of nodes connected to any node, and what these nodes are. It takes less time than with an Adjacency Matrix. But if we needed to find out whether there's an edge between u and v, it would have been easier with an adjacency matrix.

Section 9.4: Topological Sort

A topological ordering, or a topological sort, orders the vertices of a directed acyclic graph on a line, i.e. in a list, such that all directed edges go from left to right. Such an ordering cannot exist if the graph contains a directed cycle, because there is no way that you can keep going right on a line and still return to where you started from.

Formally, in a graph G = (V, E), a topological ordering is a linear ordering of all its vertices such that if G contains an edge (u, v) ∈ E from vertex u to vertex v, then u precedes v in the ordering.

It is important to note that each DAG has at least one topological sort.

There are known algorithms for constructing a topological ordering of any DAG in linear time. One example is:

1. Call depth_first_search(G) to compute finishing times v.f for each vertex v
2. As each vertex is finished, insert it into the front of a linked list
3. Return the linked list of vertices, as it is now sorted.

A topological sort can be performed in Θ(V + E) time, since the depth-first search algorithm takes Θ(V + E) time and it takes O(1) (constant time) to insert each of the |V| vertices into the front of a linked list.

Many applications use directed acyclic graphs to indicate precedences among events. We use topological sorting so that we get an ordering that processes each vertex before any of its successors.

Vertices in a graph may represent tasks to be performed, and the edges may represent constraints that one task must be performed before another; a topological ordering is a valid sequence in which to perform the set of tasks described in V.

Problem instance and its solution

Let a vertex v describe a Task(hours_to_complete: int), i.e. Task(4) describes a Task that takes 4 hours to complete, and let an edge e describe a Cooldown(hours: int), such that Cooldown(3) describes a duration of time to cool down after a completed task.

Let our graph be called dag (since it is a directed acyclic graph), and let it contain 5 vertices:

A <- dag.add_vertex(Task(4));
B <- dag.add_vertex(Task(5));
C <- dag.add_vertex(Task(3));
D <- dag.add_vertex(Task(2));
E <- dag.add_vertex(Task(7));

where we connect the vertices with directed edges such that the graph is acyclic,

// A ---> C ----+
// |      |     |
// v      v     v
// B ---> D --> E
dag.add_edge(A, B, Cooldown(2));
dag.add_edge(A, C, Cooldown(2));
dag.add_edge(B, D, Cooldown(1));
dag.add_edge(C, D, Cooldown(1));
dag.add_edge(C, E, Cooldown(1));
dag.add_edge(D, E, Cooldown(3));

then there are three possible paths from A to E,

1. A -> B -> D -> E
2. A -> C -> D -> E
3. A -> C -> E

A topological ordering must list all five vertices, and for this graph the valid orderings are A, B, C, D, E and A, C, B, D, E.

Section 9.5: Detecting a cycle in a directed graph using Depth First Traversal

A cycle in a directed graph exists if there's a back edge discovered during a DFS. A back edge is an edge from a node to itself or to one of its ancestors in a DFS tree. For a disconnected graph, we get a DFS forest, so you have to iterate through all vertices in the graph to find disjoint DFS trees.

C++ implementation:

#include <iostream>
#include <list>
#include <vector>
using namespace std;

#define NUM_V 4

bool helper(list<int> *graph, int u, vector<bool> &visited, vector<bool> &recStack)
{
    visited[u] = true;
    recStack[u] = true;
    list<int>::iterator i;
    for (i = graph[u].begin(); i != graph[u].end(); ++i)
    {
        if (recStack[*i]) // if vertex *i is found in the recursion stack of this DFS traversal
            return true;
        else if (*i == u) // if there's an edge from the vertex to itself
            return true;
        else if (!visited[*i])
        {
            if (helper(graph, *i, visited, recStack))
                return true;
        }
    }
    recStack[u] = false;
    return false;
}

/*
The wrapper function calls the helper function on each vertex which has not been visited.
The helper function returns true if it detects a back edge in the subgraph (tree), or false otherwise.
*/
bool isCyclic(list<int> *graph, int V)
{
    vector<bool> visited(V, false);  // tracks vertices already visited
    vector<bool> recStack(V, false); // tracks vertices in the recursion stack of the traversal
    for (int u = 0; u < V; u++)
    {
        if (!visited[u] && helper(graph, u, visited, recStack))
            return true;
    }
    return false;
}

int main()
{
    list<int> *graph = new list<int>[NUM_V];
    graph[0].push_back(1);
    graph[0].push_back(2);
    graph[1].push_back(2);
    graph[2].push_back(0);
    graph[2].push_back(3);
    graph[3].push_back(3);
    bool res = isCyclic(graph, NUM_V);
    cout << res << endl;
    delete[] graph;
    return 0;
}

[...] (b >= 2) and only consider the spanning forests with length limit b^k. Merge the components which are exactly the same but with different k, and call the minimum k the level of the component. Then logically make the components into a tree. u is the parent of v iff u is the smallest component distinct from v that fully contains v. The root is the whole graph and the leaves are single vertices in the original graph (with the level of negative infinity). The tree still has only O(n) nodes.
Maintain the distance of each component to the source (like in Dijkstra's algorithm). The distance of a component with more than one vertex is the minimum distance of its unexpanded children. Set the distance of the source vertex to 0 and update the ancestors accordingly.

Consider the distances in base b. When visiting a node in level k for the first time, put its children into buckets shared by all nodes of level k (as in bucket sort, replacing the heap in Dijkstra's algorithm) by the digit k and higher of its distance. Each time a node is visited, consider only its first b buckets, visit and remove each of them, update the distance of the current node, and relink the current node to its own parent using the new distance, then wait for the next visit for the following buckets.

When a leaf is visited, the current distance is the final distance of the vertex. Expand all edges from it in the original graph and update the distances accordingly.

Visit the root node (whole graph) repeatedly until the destination is reached.

It is based on the fact that there isn't an edge with length less than l between two connected components of the spanning forest with length limitation l, so, starting at distance x, you can focus on only one connected component until you reach the distance x + l. You'll visit some vertices before all vertices with shorter distance have been visited, but that doesn't matter, because it is known that there won't be a shorter path to here from those vertices. Other parts work like bucket sort / MSD radix sort, and of course, it requires the O(m) spanning tree.

Chapter 10: Graph Traversals

Section 10.1: Depth First Search traversal function

The function takes as arguments the current node index, the adjacency list (stored in a vector of vectors in this example), and a vector of booleans to keep track of which nodes have been visited.
#include <iostream>
#include <vector>
using namespace std;

void dfs(int node, vector<vector<int>>* graph, vector<bool>* visited) {
    // check whether node has been visited before
    if ((*visited)[node])
        return;

    // set as visited to avoid visiting the same node twice
    (*visited)[node] = true;

    // perform some action here
    cout << node;

    // traverse to the adjacent nodes in depth-first manner
    for (size_t i = 0; i < (*graph)[node].size(); ++i)
        dfs((*graph)[node][i], graph, visited);
}

Chapter 11: Dijkstra's Algorithm

Section 11.1: Dijkstra's Shortest Path Algorithm

Before proceeding, it is recommended to have a brief idea about Adjacency Matrix and BFS.

Dijkstra's algorithm is known as a single-source shortest path algorithm. It is used for finding the shortest paths between nodes in a graph, which may represent, for example, road networks. It was conceived by Edsger W. Dijkstra in 1956 and published three years later.

We can find the shortest path using the Breadth First Search (BFS) algorithm. This algorithm works fine, but the problem is that it assumes the cost of traversing each path is the same, which means the cost of each edge is the same. Dijkstra's algorithm helps us to find the shortest path where the cost of each path is not the same.

First we will see how to modify BFS to write Dijkstra's algorithm, then we will add a priority queue to make it a complete Dijkstra's algorithm.

Let's say the distance of each node from the source is kept in the d[] array. That is, d[3] represents the time taken to reach node 3 from the source. If we don't know the distance, we will store infinity in d[3]. Also, let cost[u][v] represent the cost of u-v. That means it takes cost[u][v] to go from node u to node v.

We need to understand Edge Relaxation. Let's say that from your house, that is the source, it takes 10 minutes to go to place A. And it takes 25 minutes to go to place B.
We have,

d[A] = 10
d[B] = 25

Now let's say it takes 7 minutes to go from place A to place B, that means:

cost[A][B] = 7

Then we can go to place B from the source by going to place A from the source and then going from place A to place B, which will take 10 + 7 = 17 minutes, instead of 25 minutes. So,

d[A] + cost[A][B] < d[B]

Then we update,

d[B] = d[A] + cost[A][B]

This is called relaxation. We will go from node u to node v, and if d[u] + cost[u][v] < d[v] then we will update d[v] = d[u] + cost[u][v].

In BFS, we didn't need to visit any node twice. We only checked whether a node was visited or not. If it was not visited, we pushed the node into the queue, marked it as visited and incremented the distance by 1. In Dijkstra, we can push a node into the queue and, instead of marking it as visited, we relax or update the new edge. Let's look at one example:

Let's assume Node 1 is the Source. Then,

d[1] = 0
d[2] = d[3] = d[4] = infinity (or a large value)

We set d[2], d[3] and d[4] to infinity because we don't know the distance yet. And the distance of the source is of course 0. Now, we go to other nodes from the source, and if we can update them, then we'll push them into the queue. For example, we'll traverse edge 1-2. As d[1] + 2 < d[2], this will make d[2] = 2. Similarly, we'll traverse edge 1-3, which makes d[3] = 5.

We can clearly see that 5 is not the shortest distance we can cross to go to node 3. So traversing a node only once, like BFS, doesn't work here. If we go from node 2 to node 3 using edge 2-3, we can update d[3] = d[2] + 1 = 3. So we can see that one node can be updated many times. How many times, you ask? If every node is processed only once, as in Dijkstra's algorithm with non-negative costs, a node can be updated at most as many times as its in-degree.

Let's see the pseudo-code for visiting any node multiple times.
We will simply modify BFS:

procedure BFSmodified(G, source):
    Q = queue()
    distance[] = infinity
    Q.enqueue(source)
    distance[source] = 0
    while Q is not empty
        u <- Q.pop()
        for all edges from u to v in G.adjacentEdges(u) do
            if distance[u] + cost[u][v] < distance[v]
                distance[v] = distance[u] + cost[u][v]
                Q.enqueue(v)
            end if
        end for
    end while
    Return distance

This can be used to find the shortest path to every node from the source. The complexity of this code is not so good. Here's why.

In BFS, when we go from node 1 to all other nodes, we follow the first come, first served method. For example, we went to node 3 from the source before processing node 2. If we go to node 3 from the source, we update node 4 as 5 + 3 = 8. When we update node 3 again from node 2, we need to update node 4 again, as 3 + 3 = 6! So node 4 is updated twice.

Dijkstra proposed that, instead of going for the first come, first served method, we update the nearest nodes first, which takes fewer updates. If we had processed node 2 first, then node 3 would have been updated first, and after updating node 4 accordingly, we'd easily get the shortest distance! The idea is to choose from the queue the node that is closest to the source. So we will use a Priority Queue here, so that when we pop the queue, it will bring us the node u closest to the source. How will it do that? It'll order the nodes by the value of d[u].

Let's see the pseudo-code:

procedure dijkstra(G, source):
    Q = priority_queue()
    distance[] = infinity
    Q.enqueue(source)
    distance[source] = 0
    while Q is not empty
        u <- node in Q with minimum distance[]
        remove u from the Q
        for all edges from u to v in G.adjacentEdges(u) do
            if distance[u] + cost[u][v] < distance[v]
                distance[v] = distance[u] + cost[u][v]
                Q.enqueue(v)
            end if
        end for
    end while
    Return distance

The pseudo-code returns the distance of all other nodes from the source. If we want to know the distance of a single node v, we can simply return the value when v is popped from the queue.
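The priority-queue version can be sketched in Python roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes the graph is given as a dict of (neighbour, cost) lists (a representation chosen for this example), that costs are non-negative, and it uses the standard heapq module as the priority queue. The edge costs are the ones used in the walkthrough (1-2: 2, 1-3: 5, 2-3: 1, 3-4: 3).

```python
import heapq

def dijkstra(graph, source):
    """graph maps each node to a list of (neighbour, cost) pairs;
    costs are assumed to be non-negative."""
    distance = {node: float("inf") for node in graph}
    distance[source] = 0
    queue = [(0, source)]                  # entries are (distance so far, node)
    while queue:
        d, u = heapq.heappop(queue)        # node currently closest to the source
        if d > distance[u]:
            continue                       # stale entry: u was improved after this push
        for v, cost in graph[u]:
            if distance[u] + cost < distance[v]:    # edge relaxation
                distance[v] = distance[u] + cost
                heapq.heappush(queue, (distance[v], v))
    return distance

# Edges from the walkthrough, stored as directed edges for brevity.
example = {1: [(2, 2), (3, 5)], 2: [(3, 1)], 3: [(4, 3)], 4: []}
print(dijkstra(example, 1))   # prints: {1: 0, 2: 2, 3: 3, 4: 6}
```

Because heapq has no decrease-key operation, the sketch pushes a fresh entry on every successful relaxation and skips outdated entries when they are popped.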
Now, does Dijkstra's Algorithm work when there's a negative edge? If there's a negative cycle, then an infinite loop will occur, as the cost keeps reducing every time. Even with a single negative edge and no negative cycle, the result can be wrong if we return as soon as the target is popped, because a later relaxation through the negative edge may still lower its distance. The version above, which keeps re-enqueueing updated nodes, would still end with correct distances, but it loses Dijkstra's running-time guarantee, so it is no longer really Dijkstra's algorithm. We'll need the Bellman–Ford algorithm for processing negative edges/cycles.

Complexity:
The complexity of BFS is O(V+E), where V is the number of nodes and E is the number of edges. For Dijkstra, the complexity is similar, but each Priority Queue operation takes O(log V). So the total complexity is O((V+E) log V) with a binary heap, or O(V log(V) + E) with a Fibonacci heap.

Below is a Java example to solve Dijkstra's Shortest Path Algorithm using Adjacency Matrix

class ShortestPath {
    static final int V = 9;

    // Returns the index of the not-yet-finalized vertex with the smallest distance.
    int minDistance(int dist[], boolean sptSet[]) {
        int min = Integer.MAX_VALUE, min_index = -1;
        for (int v = 0; v < V; v++)
            if (!sptSet[v] && dist[v] <= min) {
                min = dist[v];
                min_index = v;
            }
        return min_index;
    }

    void printSolution(int dist[], int n) {
        System.out.println("Vertex Distance from Source");
        for (int i = 0; i < V; i++)
            System.out.println(i + " \t\t " + dist[i]);
    }

    void dijkstra(int graph[][], int src) {
        int dist[] = new int[V];            // dist[i]: shortest known distance from src to i
        boolean sptSet[] = new boolean[V];  // sptSet[i]: true once the distance of i is final
        for (int i = 0; i < V; i++) {
            dist[i] = Integer.MAX_VALUE;
            sptSet[i] = false;
        }
        dist[src] = 0;

        for (int count = 0; count < V - 1; count++) {
            int u = minDistance(dist, sptSet);
            sptSet[u] = true;
            // Relax every edge u -> v (a 0 in the matrix means "no edge").
            for (int v = 0; v < V; v++)
                if (!sptSet[v] && graph[u][v] != 0 &&
                        dist[u] != Integer.MAX_VALUE &&
                        dist[u] + graph[u][v] < dist[v])
                    dist[v] = dist[u] + graph[u][v];
        }
        printSolution(dist, V);
    }

    public static void main(String[] args) {
        int graph[][] = new int[][]{{0, 4, 0, 0, 0, 0, 0, 8, 0},
                                    {4, 0, 8, 0, 0, 0, 0, 11, 0},
                                    {0, 8, 0, 7, 0, 4, 0, 0, 2},
                                    {0, 0, 7, 0, 9, 14, 0, 0, 0},
                                    {0, 0, 0, 9, 0, 10, 0, 0, 0},
                                    {0, 0, 4, 14, 10, 0, 2, 0, 0},
                                    {0, 0, 0, 0, 0, 2, 0, 1, 6},
                                    {8, 11, 0, 0, 0, 0, 1, 0, 7},
                                    {0, 0, 2, 0, 0, 0, 6, 7, 0}
                                   };
        ShortestPath t = new ShortestPath();
        t.dijkstra(graph, 0);
    }
}

Expected output of the program is

Vertex Distance from Source
0    0
1    4
2    12
3    19
4    21
5    11
6    9
7    8
8    14

Chapter 12: A* Pathfinding

Section 12.1: Introduction to A*

A* (A star) is a search algorithm that is used for finding a path from one node to another. So it can be compared with Breadth First Search, or Dijkstra's algorithm, or Depth First Search, or Best First Search. The A* algorithm is widely used in graph search for its efficiency and accuracy where graph pre-processing is not an option.

A* is a specialization of Best First Search, in which the evaluation function f is defined in a particular way.

f(n) = g(n) + h(n) is the minimum cost from the initial node to the objectives, conditioned on going through node n.
g(n) is the minimum cost from the initial node to n.
h(n) is the minimum cost from n to the objective closest to n.

A* is an informed search algorithm, and it always guarantees to find the smallest path (the path with minimum cost) in the least possible time (if it uses an admissible heuristic). So it is both complete and optimal. The following animation demonstrates A* search

Section 12.2: A* Pathfinding through a maze with no obstacles

Let's say we have the following 4 by 4 grid:

Let's assume that this is a maze. There are no walls/obstacles, though. We only have a starting point (the green square), and an ending point (the red square). Let's also assume that in order to get from green to red, we cannot move diagonally. So, starting from the green square, let's see which squares we can move to, and highlight them in blue:

In order to choose which square to move to next, we need to take into account three values:

1. The "g" value - This is how far away this node is from the green square.
2. The "h" value - This is how far away this node is from the red square.
3. The "f" value - This is the sum of the "g" value and the "h" value. This is the final number which tells us which node to move to.

In order to calculate these values, this is the formula we will use:

distance = abs(from.x - to.x) + abs(from.y - to.y)

This is known as the "Manhattan Distance" formula.

Let's calculate the "g" value for the blue square immediately to the left of the green square:

abs(3 - 2) + abs(2 - 2) = 1

Great! We've got the value: 1. Now, let's try calculating the "h" value:

abs(2 - 0) + abs(2 - 0) = 4

Perfect. Now, let's get the "f" value:

1 + 4 = 5

So, the final value for this node is "5".

Let's do the same for all the other blue squares. The big number in the center of each square is the "f" value, while the number on the top left is the "g" value, and the number on the top right is the "h" value:

We've calculated the g, h, and f values for all of the blue nodes. Now, which do we pick?

Whichever one has the lowest f value.

However, in this case, we have 2 nodes with the same f value, 5. How do we pick between them?

Simply, either choose one at random, or have a priority set. I usually prefer to have a priority like so: "Right > Up > Down > Left"

One of the nodes with the f value of 5 takes us in the "Down" direction, and the other takes us "Left". Since Down is at a higher priority than Left, we choose the square which takes us "Down".
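The g, h and f values computed above for the square to the left of the green square can be reproduced with a few lines of Python. This is only a sketch: the helper name manhattan and the (x, y) tuples are assumptions made for the example, and the coordinates are inferred from the arithmetic in the text (green start at (3, 2), red goal at (0, 0), blue square at (2, 2)).

```python
def manhattan(a, b):
    """Manhattan distance between two (x, y) grid positions."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

# Coordinates inferred from the worked arithmetic, not read from the figure.
start, goal, node = (3, 2), (0, 0), (2, 2)

g = manhattan(start, node)   # how far the node is from the green square
h = manhattan(node, goal)    # how far the node is from the red square
f = g + h                    # the value used to choose the next node
print(g, h, f)               # prints: 1 4 5
```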
I now mark the nodes which we calculated the heuristics for, but did not move to, as orange, and the node which we chose as cyan:

Alright, now let's calculate the same heuristics for the nodes around the cyan node:

Again, we choose the node going down from the cyan node, as all the options have the same f value:

Let's calculate the heuristics for the only neighbour that the cyan node has:

Alright, we will follow the same pattern we have been following:

Once more, let's calculate the heuristics for the node's neighbour:

Let's move there:

Finally, we can see that we have a winning square beside us, so we move there, and we are done.

Section 12.3: Solving 8-puzzle problem using A* algorithm

Problem definition:

An 8 puzzle is a simple game consisting of a 3 x 3 grid (containing 9 squares). One of the squares is empty. The object is to move the squares around into different positions until the numbers are displayed as in the "goal state".

Given an initial state of the 8-puzzle game and a final state to be reached, find the most cost-effective path to reach the final state from the initial state.

Initial state:

_ 1 3
4 2 5
7 8 6

Final state:

1 2 3
4 5 6
7 8 _

Heuristic to be assumed:

Let us consider the Manhattan distance between the current and final state as the heuristic for this problem statement.
h(n) = | x - p | + | y - q |
where x and y are cell co-ordinates in the current state
      p and q are cell co-ordinates in the final state

Total cost function:

So the total cost function f(n) is given by,

f(n) = g(n) + h(n), where g(n) is the cost required to reach the current state from the given initial state

Solution to example problem:

First we find the heuristic value required to reach the final state from the initial state. The cost function g(n) = 0, as we are in the initial state.

h(n) = 8

The above value is obtained because 1 in the current state is 1 horizontal step away from the 1 in the final state. Each of 2, 5 and 6 is likewise 1 step away from its final position. _ is 2 horizontal steps and 2 vertical steps away. So the total value for h(n) is 1 + 1 + 1 + 1 + 2 + 2 = 8. The total cost function f(n) is equal to 8 + 0 = 8.

Now, the possible states that can be reached from the initial state are found, and it happens that we can move _ either to the right or downwards.

So the states obtained after making those moves are:

1 _ 3    4 1 3
4 2 5    _ 2 5
7 8 6    7 8 6
 (1)      (2)

Again the cost is computed for these states using the method described above, and it turns out to be 6 and 8 respectively. (The values quoted for each state in this walkthrough are its heuristic value h(n); adding g(n), the number of moves made so far, does not change which state is chosen in this example.) We choose the state with minimum cost, which is state (1). The next possible moves can be Left, Right or Down. We won't move Left, as we were previously in that state. So, we can move Right or Down.

Again we find the states obtained from (1).

1 3 _    1 2 3
4 2 5    4 _ 5
7 8 6    7 8 6
 (3)      (4)

(3) leads to a cost equal to 6 and (4) leads to 4. Also, we will consider (2), obtained before, which has a cost equal to 8. Choosing the minimum from them leads to (4). The next possible moves can be Left, Right or Down. We get the states:

1 2 3    1 2 3    1 2 3
_ 4 5    4 5 _    4 8 5
7 8 6    7 8 6    7 _ 6
 (5)      (6)      (7)

We get costs equal to 6, 2 and 4 for (5), (6) and (7) respectively. Also, we have the previous states (3) and (2) with 6 and 8 respectively. We choose the minimum cost state, which is (6).
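The heuristic used in this walkthrough can be sketched in Python as follows. The function name heuristic and the representation of a state as three strings are assumptions made for this example. Note that, following the walkthrough, the blank cell _ is counted as well; many implementations leave the blank out so that the heuristic stays admissible.

```python
def heuristic(state, goal):
    """Sum of the Manhattan distances of every cell value (the blank '_'
    included, as in the walkthrough) between its position in state and in goal."""
    position_in_goal = {value: (r, c)
                        for r, row in enumerate(goal)
                        for c, value in enumerate(row)}
    total = 0
    for r, row in enumerate(state):
        for c, value in enumerate(row):
            p, q = position_in_goal[value]
            total += abs(r - p) + abs(c - q)
    return total

initial = ["_13", "425", "786"]
final = ["123", "456", "78_"]
print(heuristic(initial, final))   # prints: 8
```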
The next possible moves are Up and Down, and clearly Down will lead us to the final state, where the heuristic function value is equal to 0.

Chapter 13: A* Pathfinding Algorithm

This topic is going to focus on the A* Pathfinding algorithm, how it's used, and why it works.

Note to future contributors: I have added an example for A* Pathfinding without any obstacles, on a 4x4 grid. An example with obstacles is still needed.

Section 13.1: Simple Example of A* Pathfinding: A maze with no obstacles

Let's say we have the following 4 by 4 grid:

Let's assume that this is a maze. There are no walls/obstacles, though. We only have a starting point (the green square), and an ending point (the red square). Let's also assume that in order to get from green to red, we cannot move diagonally. So, starting from the green square, let's see which squares we can move to, and highlight them in blue:

In order to choose which square to move to next, we need to take into account three values:

1. The "g" value - This is how far away this node is from the green square.
2. The "h" value - This is how far away this node is from the red square.
3. The "f" value - This is the sum of the "g" value and the "h" value. This is the final number which tells us which node to move to.

In order to calculate these values, this is the formula we will use:

distance = abs(from.x - to.x) + abs(from.y - to.y)

This is known as the "Manhattan Distance" formula.

Let's calculate the "g" value for the blue square immediately to the left of the green square:

abs(3 - 2) + abs(2 - 2) = 1

Great! We've got the value: 1. Now, let's try calculating the "h" value:

abs(2 - 0) + abs(2 - 0) = 4

Perfect. Now, let's get the "f" value:

1 + 4 = 5

So, the final value for this node is "5".

Let's do the same for all the other blue squares.
The big number in the center of each square is the "f" value, while the number on the top left is the "g" value, and the number on the top right is the "h" value:

We've calculated the g, h, and f values for all of the blue nodes. Now, which do we pick?

Whichever one has the lowest f value.

However, in this case, we have 2 nodes with the same f value, 5. How do we pick between them?

Simply, either choose one at random, or have a priority set. I usually prefer to have a priority like so: "Right > Up > Down > Left"

One of the nodes with the f value of 5 takes us in the "Down" direction, and the other takes us "Left". Since Down is at a higher priority than Left, we choose the square which takes us "Down".

I now mark the nodes which we calculated the heuristics for, but did not move to, as orange, and the node which we chose as cyan:

Alright, now let's calculate the same heuristics for the nodes around the cyan node:

Again, we choose the node going down from the cyan node, as all the options have the same f value:

Let's calculate the heuristics for the only neighbour that the cyan node has:

Alright, we follow the same pattern as before and move there:

Once more, let's calculate the heuristics for the node's neighbour:

Let's move there:

Finally, we can see that we have a winning square beside us, so we move there, and we are done.

Chapter 14: Dynamic Programming

Dynamic programming is a widely used concept and it's often used for optimization. It refers to simplifying a complicated problem by breaking it down into simpler sub-problems in a recursive manner, usually with a bottom-up approach.
There are two key attributes that a problem must have in order for dynamic programming to be applicable: "Optimal substructure" and "Overlapping sub-problems". To achieve its optimization, dynamic programming uses a concept called memoization.

Section 14.1: Edit Distance

The problem statement: given two strings str1 and str2, what is the minimum number of operations (inserting, deleting or replacing a character) needed to convert str1 into str2?

Implementation in Java

    public class EditDistance {

        public static void main(String[] args) {
            String str1 = "march";
            String str2 = "cart";

            EditDistance ed = new EditDistance();
            System.out.println(ed.getMinConversions(str1, str2));
        }

        public int getMinConversions(String str1, String str2) {
            // dp[i][j] = edits needed to turn the first i chars of str1
            // into the first j chars of str2
            int[][] dp = new int[str1.length() + 1][str2.length() + 1];
            for (int i = 0; i <= str1.length(); i++) {
                for (int j = 0; j <= str2.length(); j++) {
                    if (i == 0)
                        dp[i][j] = j;                 // insert all j characters
                    else if (j == 0)
                        dp[i][j] = i;                 // delete all i characters
                    else if (str1.charAt(i - 1) == str2.charAt(j - 1))
                        dp[i][j] = dp[i - 1][j - 1];  // characters match, no edit
                    else {
                        // 1 + cheapest of delete, insert, replace
                        dp[i][j] = 1 + Math.min(dp[i - 1][j], Math.min(dp[i][j - 1], dp[i - 1][j - 1]));
                    }
                }
            }
            return dp[str1.length()][str2.length()];
        }
    }

Output

    3

Section 14.2: Weighted Job Scheduling Algorithm

The Weighted Job Scheduling Algorithm can also be called the Weighted Activity Selection Algorithm.

The problem is: given certain jobs with their start time and end time, and a profit you make when you finish each job, what is the maximum profit you can make, given that no two jobs can be executed in parallel?

This one looks like Activity Selection using a Greedy Algorithm, but there's an added twist. That is, instead of maximizing the number of jobs finished, we focus on making the maximum profit. The number of jobs performed doesn't matter here.
Let's look at an example:

+-------------------------+---------+---------+---------+---------+---------+---------+
|          Name           |    A    |    B    |    C    |    D    |    E    |    F    |
+-------------------------+---------+---------+---------+---------+---------+---------+
|(Start Time, Finish Time)|  (2,5)  |  (6,7)  |  (7,9)  |  (1,3)  |  (5,8)  |  (4,6)  |
+-------------------------+---------+---------+---------+---------+---------+---------+
|         Profit          |    6    |    4    |    2    |    5    |   11    |    5    |
+-------------------------+---------+---------+---------+---------+---------+---------+

The jobs are denoted with a name, their start and finish times, and their profit. After a few iterations, we can find out that if we perform Job-A and Job-E, we get the maximum profit of 17. Now how do we find this out using an algorithm?

The first thing we do is sort the jobs by their finish time in non-decreasing order. Why do we do this? Because if we select a job that finishes earlier, we leave the most time for choosing other jobs. We have:

+-------------------------+---------+---------+---------+---------+---------+---------+
|          Name           |    D    |    A    |    F    |    B    |    E    |    C    |
+-------------------------+---------+---------+---------+---------+---------+---------+
|(Start Time, Finish Time)|  (1,3)  |  (2,5)  |  (4,6)  |  (6,7)  |  (5,8)  |  (7,9)  |
+-------------------------+---------+---------+---------+---------+---------+---------+
|         Profit          |    5    |    6    |    5    |    4    |   11    |    2    |
+-------------------------+---------+---------+---------+---------+---------+---------+

We'll have an additional temporary array Acc_Prof of size n (here, n denotes the total number of jobs). This will contain the maximum accumulated profit of performing the jobs. Don't get it? Wait and watch. We'll initialize the values of the array with the profit of each job. That means Acc_Prof[i] will at first hold the profit of performing the i-th job.
+-------------------------+---------+---------+---------+---------+---------+---------+
|        Acc_Prof         |    5    |    6    |    5    |    4    |   11    |    2    |
+-------------------------+---------+---------+---------+---------+---------+---------+

Now let's denote position 2 with i, and position 1 with j. Our strategy will be to iterate j from 1 to i-1, and after each such pass we will increment i by 1, until i becomes n+1.

                               j         i
+-------------------------+---------+---------+---------+---------+---------+---------+
|          Name           |    D    |    A    |    F    |    B    |    E    |    C    |
+-------------------------+---------+---------+---------+---------+---------+---------+
|(Start Time, Finish Time)|  (1,3)  |  (2,5)  |  (4,6)  |  (6,7)  |  (5,8)  |  (7,9)  |
+-------------------------+---------+---------+---------+---------+---------+---------+
|         Profit          |    5    |    6    |    5    |    4    |   11    |    2    |
+-------------------------+---------+---------+---------+---------+---------+---------+
|        Acc_Prof         |    5    |    6    |    5    |    4    |   11    |    2    |
+-------------------------+---------+---------+---------+---------+---------+---------+

We check whether Job[i] and Job[j] overlap; that is, if the finish time of Job[j] is greater than Job[i]'s start time, then these two jobs can't be done together. However, if they don't overlap, we check whether Acc_Prof[j] + Profit[i] > Acc_Prof[i]. If this is the case, we update Acc_Prof[i] = Acc_Prof[j] + Profit[i]. That is:

    if Job[j].finish_time <= Job[i].start_time
        if Acc_Prof[j] + Profit[i] > Acc_Prof[i]
            Acc_Prof[i] = Acc_Prof[j] + Profit[i]
        endif
    endif

Here Acc_Prof[j] + Profit[i] represents the accumulated profit of doing these two jobs together. Let's check it for our example:

Here Job[j] overlaps with Job[i], so these two can't be done together. Since our j is equal to i-1, we increment the value of i to i+1, that is 3, and we make j = 1.
                               j                   i
+-------------------------+---------+---------+---------+---------+---------+---------+
|          Name           |    D    |    A    |    F    |    B    |    E    |    C    |
+-------------------------+---------+---------+---------+---------+---------+---------+
|(Start Time, Finish Time)|  (1,3)  |  (2,5)  |  (4,6)  |  (6,7)  |  (5,8)  |  (7,9)  |
+-------------------------+---------+---------+---------+---------+---------+---------+
|         Profit          |    5    |    6    |    5    |    4    |   11    |    2    |
+-------------------------+---------+---------+---------+---------+---------+---------+
|        Acc_Prof         |    5    |    6    |    5    |    4    |   11    |    2    |
+-------------------------+---------+---------+---------+---------+---------+---------+

Now Job[j] and Job[i] don't overlap. The total amount of profit we can make by picking these two jobs is Acc_Prof[j] + Profit[i] = 5 + 5 = 10, which is greater than Acc_Prof[i]. So we update Acc_Prof[i] = 10. We also increment j by 1. We get:

                                         j         i
+-------------------------+---------+---------+---------+---------+---------+---------+
|          Name           |    D    |    A    |    F    |    B    |    E    |    C    |
+-------------------------+---------+---------+---------+---------+---------+---------+
|(Start Time, Finish Time)|  (1,3)  |  (2,5)  |  (4,6)  |  (6,7)  |  (5,8)  |  (7,9)  |
+-------------------------+---------+---------+---------+---------+---------+---------+
|         Profit          |    5    |    6    |    5    |    4    |   11    |    2    |
+-------------------------+---------+---------+---------+---------+---------+---------+
|        Acc_Prof         |    5    |    6    |   10    |    4    |   11    |    2    |
+-------------------------+---------+---------+---------+---------+---------+---------+

Here, Job[j] overlaps with Job[i] and j is also equal to i-1. So we increment i by 1, and make j = 1.
We get:

                               j                             i
+-------------------------+---------+---------+---------+---------+---------+---------+
|          Name           |    D    |    A    |    F    |    B    |    E    |    C    |
+-------------------------+---------+---------+---------+---------+---------+---------+
|(Start Time, Finish Time)|  (1,3)  |  (2,5)  |  (4,6)  |  (6,7)  |  (5,8)  |  (7,9)  |
+-------------------------+---------+---------+---------+---------+---------+---------+
|         Profit          |    5    |    6    |    5    |    4    |   11    |    2    |
+-------------------------+---------+---------+---------+---------+---------+---------+
|        Acc_Prof         |    5    |    6    |   10    |    4    |   11    |    2    |
+-------------------------+---------+---------+---------+---------+---------+---------+

Now, Job[j] and Job[i] don't overlap, and we get the accumulated profit 5 + 4 = 9, which is greater than Acc_Prof[i]. We update Acc_Prof[i] = 9 and increment j by 1.

                                         j                   i
+-------------------------+---------+---------+---------+---------+---------+---------+
|          Name           |    D    |    A    |    F    |    B    |    E    |    C    |
+-------------------------+---------+---------+---------+---------+---------+---------+
|(Start Time, Finish Time)|  (1,3)  |  (2,5)  |  (4,6)  |  (6,7)  |  (5,8)  |  (7,9)  |
+-------------------------+---------+---------+---------+---------+---------+---------+
|         Profit          |    5    |    6    |    5    |    4    |   11    |    2    |
+-------------------------+---------+---------+---------+---------+---------+---------+
|        Acc_Prof         |    5    |    6    |   10    |    9    |   11    |    2    |
+-------------------------+---------+---------+---------+---------+---------+---------+

Again Job[j] and Job[i] don't overlap. The accumulated profit is 6 + 4 = 10, which is greater than Acc_Prof[i]. We again update Acc_Prof[i] = 10. We increment j by 1.
We get:

                                                   j         i
+-------------------------+---------+---------+---------+---------+---------+---------+
|          Name           |    D    |    A    |    F    |    B    |    E    |    C    |
+-------------------------+---------+---------+---------+---------+---------+---------+
|(Start Time, Finish Time)|  (1,3)  |  (2,5)  |  (4,6)  |  (6,7)  |  (5,8)  |  (7,9)  |
+-------------------------+---------+---------+---------+---------+---------+---------+
|         Profit          |    5    |    6    |    5    |    4    |   11    |    2    |
+-------------------------+---------+---------+---------+---------+---------+---------+
|        Acc_Prof         |    5    |    6    |   10    |   10    |   11    |    2    |
+-------------------------+---------+---------+---------+---------+---------+---------+

If we continue this process, after iterating through the whole table using i, our table will finally look like:

+-------------------------+---------+---------+---------+---------+---------+---------+
|          Name           |    D    |    A    |    F    |    B    |    E    |    C    |
+-------------------------+---------+---------+---------+---------+---------+---------+
|(Start Time, Finish Time)|  (1,3)  |  (2,5)  |  (4,6)  |  (6,7)  |  (5,8)  |  (7,9)  |
+-------------------------+---------+---------+---------+---------+---------+---------+
|         Profit          |    5    |    6    |    5    |    4    |   11    |    2    |
+-------------------------+---------+---------+---------+---------+---------+---------+
|        Acc_Prof         |    5    |    6    |   10    |   14    |   17    |   16    |
+-------------------------+---------+---------+---------+---------+---------+---------+

* A few steps have been skipped to make the document shorter.

If we iterate through the array Acc_Prof, we can find out the maximum profit to be 17!
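The walkthrough above can be replayed with a short Python sketch. It is an illustration, not a reference implementation: the tuple layout (name, start, finish, profit), the function name weighted_job_scheduling, and the use of 0-based indices instead of the 1-based positions used in the text are all assumptions made here.

```python
# Sketch of the Acc_Prof construction walked through above.
# Jobs are (name, start, finish, profit) tuples taken from the example table.
# Note: indices are 0-based here, unlike the 1-based positions in the text.

def weighted_job_scheduling(jobs):
    """Return (max_profit, acc_prof, sorted_jobs) using the O(n^2) approach."""
    jobs = sorted(jobs, key=lambda job: job[2])   # sort by finish time
    acc_prof = [job[3] for job in jobs]           # start with each job's own profit
    for i in range(1, len(jobs)):
        for j in range(i):
            # Job j can precede job i only if it finishes no later than i starts.
            if jobs[j][2] <= jobs[i][1]:
                acc_prof[i] = max(acc_prof[i], acc_prof[j] + jobs[i][3])
    return max(acc_prof, default=0), acc_prof, jobs

jobs = [("A", 2, 5, 6), ("B", 6, 7, 4), ("C", 7, 9, 2),
        ("D", 1, 3, 5), ("E", 5, 8, 11), ("F", 4, 6, 5)]

best, acc_prof, ordered = weighted_job_scheduling(jobs)
print([job[0] for job in ordered])   # job names in finish-time order
print(acc_prof)                      # accumulated profit per position
print(best)                          # maximum profit
```

For the example jobs this reports a maximum profit of 17, matching the result found by hand.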
The pseudo-code:

    Procedure WeightedJobScheduling(Job)
    sort Job according to finish time in non-decreasing order
    for i -> 1 to n
        Acc_Prof[i] = Profit[i]
    endfor
    for i -> 2 to n
        for j -> 1 to i-1
            if Job[j].finish_time <= Job[i].start_time
                if Acc_Prof[j] + Profit[i] > Acc_Prof[i]
                    Acc_Prof[i] = Acc_Prof[j] + Profit[i]
                endif
            endif
        endfor
    endfor
    maxProfit = 0
    for i -> 1 to n
        if maxProfit < Acc_Prof[i]
            maxProfit = Acc_Prof[i]
        endif
    endfor
    return maxProfit

The complexity of populating the Acc_Prof array is O(n^2). The array traversal takes O(n). So the total complexity of this algorithm is O(n^2).

Now, if we want to find out which jobs were performed to get the maximum profit, we need to traverse the array in reverse order, and if Acc_Prof[i] matches maxProfit, we push the name of the job onto a stack and subtract the profit of that job from maxProfit. We do this while maxProfit > 0 and we have not reached the beginning of the Acc_Prof array. The pseudo-code will look like:

    Procedure FindingPerformedJobs(Job, Acc_Prof, maxProfit):
    S = stack()
    for i -> n down to 1 and maxProfit > 0
        if maxProfit is equal to Acc_Prof[i]
            S.push(Job[i].name)
            maxProfit = maxProfit - Job[i].profit
        endif
    endfor
    return S

The complexity of this procedure is O(n).

One thing to remember: if there are multiple job schedules that give the maximum profit, we can only find one of them via this procedure.

Section 14.3: Longest Common Subsequence

Given two strings, we have to find the longest common subsequence present in both of them.

Example

LCS for input sequences "ABCDGH" and "AEDFHR" is "ADH" of length 3.

LCS for input sequences "AGGTAB" and "GXTXAYB" is "GTAB" of length 4.
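As a quick check of the two examples above, here is a minimal Python sketch that returns the subsequence itself rather than only its length. It uses the usual table of prefix lengths followed by a walk back through the table; the function name longest_common_subsequence is an assumption made for illustration.

```python
def longest_common_subsequence(a, b):
    """Return one longest common subsequence of strings a and b."""
    m, n = len(a), len(b)
    # table[i][j] = LCS length of a[:i] and b[:j]
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])

    # Walk back from the bottom-right corner to recover the characters.
    chars = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            chars.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(chars))

print(longest_common_subsequence("ABCDGH", "AEDFHR"))
print(longest_common_subsequence("AGGTAB", "GXTXAYB"))
```

When several subsequences share the maximum length, this sketch returns only one of them; which one depends on the tie-breaking rule in the walk back.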
Implementation in Java

    public class LCS {

        public static void main(String[] args) {
            String str1 = "AGGTAB";
            String str2 = "GXTXAYB";
            LCS obj = new LCS();
            System.out.println(obj.lcs(str1, str2, str1.length(), str2.length()));
            System.out.println(obj.lcs2(str1, str2));
        }

        // Recursive function: m and n are the lengths of the prefixes being compared
        public int lcs(String str1, String str2, int m, int n) {
            if (m == 0 || n == 0)
                return 0;
            if (str1.charAt(m - 1) == str2.charAt(n - 1))
                return 1 + lcs(str1, str2, m - 1, n - 1);
            else
                return Math.max(lcs(str1, str2, m - 1, n), lcs(str1, str2, m, n - 1));
        }

        // Iterative function: lcs[i][j] is the LCS length of the first i and first j characters
        public int lcs2(String str1, String str2) {
            int[][] lcs = new int[str1.length() + 1][str2.length() + 1];

            for (int i = 0; i <= str1.length(); i++) {
                for (int j = 0; j <= str2.length(); j++) {
                    if (i == 0 || j == 0) {
                        lcs[i][j] = 0;
                    } else if (str1.charAt(i - 1) == str2.charAt(j - 1)) {
                        lcs[i][j] = 1 + lcs[i - 1][j - 1];
                    } else {
                        lcs[i][j] = Math.max(lcs[i - 1][j], lcs[i][j - 1]);
                    }
                }
            }

            return lcs[str1.length()][str2.length()];
        }
    }

Output (one line per method)

    4
    4

Section 14.4: Fibonacci Number

Bottom-up approach for printing the nth Fibonacci number using Dynamic Programming.

Recursive Tree

                              fib(5)
                         /              \
                   fib(4)                fib(3)
                 /        \              /     \
            fib(3)        fib(2)     fib(2)   fib(1)
           /     \        /    \     /    \
       fib(2)  fib(1) fib(1) fib(0) fib(1) fib(0)
       /    \
    fib(1) fib(0)

Overlapping Sub-problems

Here fib(0), fib(1), fib(2) and fib(3) are the overlapping sub-problems. fib(0) is repeated 3 times, fib(1) is repeated 5 times, fib(2) is repeated 3 times and fib(3) is repeated 2 times.

Implementation

    public int fib(int n) {
        if (n < 2)
            return n;               // fib(0) = 0, fib(1) = 1; also keeps f[1] in bounds
        int[] f = new int[n + 1];
        f[0] = 0;
        f[1] = 1;
        for (int i = 2; i <= n; i++) {
            f[i] = f[i - 1] + f[i - 2];
        }
        return f[n];
    }

Time Complexity

O(n)

Section 14.5: Longest Common Substring

Given two strings str1 and str2, we have to find the length of the longest common substring between them.

Examples

Input : X = "abcdxyz", y = "xyzabcd"
Output : 4

The longest common substring is "abcd" and is of length 4.
Input : X = "zxabcdezy", y = "yzabcdezx"
Output : 6

The longest common substring is "abcdez" and is of length 6.

Implementation in Java

    public int getLongestCommonSubstring(String str1, String str2) {
        // arr[i][j] = length of the common substring ending at str2[i-1] and str1[j-1]
        int[][] arr = new int[str2.length() + 1][str1.length() + 1];
        int max = 0;    // no common character means length 0
        for (int i = 1; i <= str2.length(); i++) {
            for (int j = 1; j <= str1.length(); j++) {
                if (str1.charAt(j - 1) == str2.charAt(i - 1)) {
                    arr[i][j] = arr[i - 1][j - 1] + 1;
                    if (arr[i][j] > max)
                        max = arr[i][j];
                }
                else
                    arr[i][j] = 0;
            }
        }
        return max;
    }

Time Complexity

O(m*n)

Chapter 15: Applications of Dynamic Programming

The basic idea behind dynamic programming is breaking a complex problem down into several small and simple problems that are repeated. If you can identify a simple subproblem that is repeatedly calculated, odds are there is a dynamic programming approach to the problem.

As this topic is titled Applications of Dynamic Programming, it will focus more on applications than on the process of creating dynamic programming algorithms.

Section 15.1: Fibonacci Numbers

Fibonacci Numbers are a prime subject for dynamic programming, as the traditional recursive approach makes a lot of repeated calculations. In these examples I will be using the base case of f(0) = f(1) = 1.

Here is an example recursive tree for fibonacci(4), note the repeated computations:

Non-Dynamic Programming

O(2^n) Runtime Complexity, O(n) Stack complexity

    def fibonacci(n):
        if n < 2:
            return 1
        return fibonacci(n-1) + fibonacci(n-2)

This is the most intuitive way to write the problem. At most the stack space will be O(n), as you descend the first recursive branch making calls to fibonacci(n-1) until you hit the base case n < 2.

A proof of the O(2^n) runtime complexity can be seen here: Computational complexity of Fibonacci Sequence.
The main point to note is that the runtime is exponential, which means it grows by a roughly constant factor for every subsequent term: fibonacci(15) makes about 1.6 times as many calls as fibonacci(14), with O(2^n) as an upper bound.

Memoized

O(n) Runtime Complexity, O(n) Space complexity, O(n) Stack complexity

    memo = []
    memo.append(1) # f(0) = 1
    memo.append(1) # f(1) = 1

    def fibonacci(n):
        if len(memo) > n:
            return memo[n]

        result = fibonacci(n-1) + fibonacci(n-2)
        memo.append(result) # f(n) = f(n-1) + f(n-2)
        return result

With the memoized approach we introduce an array that can be thought of as all the previous function calls. The location memo[n] is the result of the function call fibonacci(n). This allows us to trade a space complexity of O(n) for an O(n) runtime, as we no longer need to compute duplicate function calls.

Iterative Dynamic Programming

O(n) Runtime complexity, O(n) Space complexity, No recursive stack

    def fibonacci(n):
        memo = [1,1] # f(0) = 1, f(1) = 1

        for i in range(2, n+1):
            memo.append(memo[i-1] + memo[i-2])

        return memo[n]

If we break the problem down into its core elements, you will notice that in order to compute fibonacci(n) we need fibonacci(n-1) and fibonacci(n-2). We can also notice that our base case will appear at the end of that recursive tree, as seen above.

With this information, it now makes sense to compute the solution backwards, starting at the base cases and working upwards.

Now, in order to calculate fibonacci(n), we first calculate all the Fibonacci numbers up to and through n.

The main benefit here is that we have eliminated the recursive stack while keeping the O(n) runtime. Unfortunately, we still have an O(n) space complexity, but that can be changed as well.
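To make the difference between the plain recursive version and the memoized version concrete, the following sketch counts how many function calls each one makes. The counting helpers count_naive_calls and count_memoized_calls are hypothetical names introduced only for this illustration; the memoized counter starts from a fresh memo of [1, 1] on every run.

```python
def count_naive_calls(n):
    """Number of calls made by the plain recursive fibonacci(n)."""
    if n < 2:
        return 1
    return 1 + count_naive_calls(n - 1) + count_naive_calls(n - 2)

def count_memoized_calls(n):
    """Number of calls made by the memoized version, starting from a fresh memo."""
    memo = [1, 1]   # f(0) = 1, f(1) = 1
    calls = 0

    def fibonacci(k):
        nonlocal calls
        calls += 1
        if len(memo) > k:
            return memo[k]
        result = fibonacci(k - 1) + fibonacci(k - 2)
        memo.append(result)
        return result

    fibonacci(n)
    return calls

for n in (10, 15, 20):
    print(n, count_naive_calls(n), count_memoized_calls(n))
```

The naive count grows much faster than linearly with n, while the memoized count grows by a fixed amount per extra term, which is the trade of memory for runtime described above.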
Advanced Iterative Dynamic Programming

O(n) Runtime complexity, O(1) Space complexity, No recursive stack

    def fibonacci(n):
        memo = [1,1] # f(0) = 1, f(1) = 1

        for i in range(2, n+1):
            memo[i%2] = memo[0] + memo[1]

        return memo[n%2]

As noted above, the iterative dynamic programming approach starts from the base cases and works up to the end result. The key observation to make in order to get the space complexity down to O(1) (constant) is the same observation we made for the recursive stack: we only need fibonacci(n-1) and fibonacci(n-2) to build fibonacci(n). This means that we only need to save the results for fibonacci(n-1) and fibonacci(n-2) at any point in our iteration.

To store these last 2 results I use an array of size 2 and simply flip which index I am assigning to by using i % 2, which will alternate like so: 0, 1, 0, 1, 0, 1, ..., i % 2.

I add both indexes of the array together because we know that addition is commutative (5 + 6 = 11 and 6 + 5 = 11). The result is then assigned to the older of the two spots (denoted by i % 2). The final result is then stored at the position n%2.

Notes

It is important to note that sometimes it may be best to come up with an iterative memoized solution for functions that perform large calculations repeatedly, as you will build up a cache of the answers to the function calls, and subsequent calls may be O(1) if the answer has already been computed.