I was never a fan of diving into the math needed for understanding algorithms, until I found myself in the hole with if/else loops and horrible quadratic growth and complexity. To get my Data Engineering knowledge and skills to the next level I had to face up my math and dive into this book.
As a side note: Never skip out on the harder stuff in school, or in life in general, it will come back to bite you.
My knowledge before this book:
- I understand that multiple loops can grow exponentially
- I understand the basic sorting, like quicksort and binary tree searching
- I can grasp some of the more advanced concepts but not enough to immediately recognize or implement them
How can I test myself?
For my graduation project I wrote a Django webservice/API to get real and generated shipping prices. To get those prices I needed to retrieve a lot of different classes from a database. This was done a few years ago with a lot less knowledge then I now have. To check if I learned a lot from this book I will go over the code again to check what I can improve.
I tried to build a program that can create workouts based on the RTS principles. This program ran into some infinite loops and a lot is done via basic looping so that can be improved quite a lot. To check if I learned a lot from this book I will go over the code again to check what I can improve.
So this might not be just a review of the book but also a recap for myself on the basic of algorithms. The second which I desperately need as I shamelessly failed the first exercise(Check if 2 strings are anagrams) by thinking too simple. I checked if all letters from a are in b forgetting that b could also contain extra letters and my code would flag it as a anagram.
What is an algorithm? From the book
An algorithm is a procedure , consisting of a finite set of steps that solves a given problem in a finite time.
Finite ofcourse because it has to end some time else you would have an infinite loop and would have to wait for eternity.
Another big thing to remember is running time of the algorithms, also known as the Big O notation. O(g) means that the set of functions(the algorithm) does not grow faster than g, which would be a constant scale. If you have 100 items this does 100 units of work, for 200 items it’s 200 units and so forth.
Here is a table for all the variations
|Notation|Name|Example| |O(1) | Constant list.index() |O(log n) | Logarithmic |binary search |O(n) | Linear |sequential search |O(n log n) | LogLinear |quicksort |O(n^2) | Quadratic |insertion sort |O(n^3) | Cubic |matrix multiplication |O(k^n) | Exponential |travelling salesman |O(n!) | Factorial |bogosort
The book goes deeper into Python data types, mainly lists and derivatives such as: set, linked lists, stacks and queues. I worked with those before but never dove deeper into how they work and why you use certain types over others.
What I learned is using sets to check membership is a lot faster then checking if an item exists in a list. That is because a set uses hashes and items can only exist once, so it does a linear lookup. Where as a list can contain the same item twice and every item has to be checked because of this.
After surviving the second chapter I quickly learned that the math in the book is a bit too much for me right now. So I will save it for later on this year where I will have the knowledge to fully grasp the concepts in the book.
However while searching for more information on certain context in the book I stumbled upon an article which referred to Impractical Python as a way to learn more about data types. So I took a look at that as well.