
KZK's blog

Lead engineer

Mental Models

First Principles

First principles are the foundational truths on which we build our understanding. If we never learn to take something apart, test our assumptions about it, and reconstruct it, we end up bound by what other people tell us, trapped in the way things have always been done. When the environment changes, we just continue as if things were the same, making costly mistakes along the way.


Five Whys

It is a way to probe the depth of what you know, and what you don't know, by trying to understand the world the way a five-year-old does: compulsively asking "Why?" every time. The idea is to take the idea we want to go deep on and ask "Why is this true?", and whatever the answer is, ask "Why?" again. If any of the answers comes down to "because this is how it is" or "just because", you have found the edge of your understanding.

The Map Is Not the Territory

The map of reality is not reality. Even the best map is not: if a map were perfectly accurate, it would have to contain the map itself and would be as complex as reality. Maps are reductions of reality that help us understand it and make decisions without having to understand the whole territory. Most of the time we just need good-enough information; the truth is we can only navigate a complex reality through some abstraction.

Thought Experiment

Thought experiments allow us to dig deeper into "What if" questions, allowing us to: improve our understanding of the world, by imagining "what if…" and then checking how the world would change (this also tells you a lot about what you don't know); and stress-test our plans. For example: I buy 1000 shares of a stock. What if it starts going up 10% in a single day, should I sell? What if it goes down, but little by little?


Bank of trust: each interaction is an opportunity to gain or lose trust.

Eigenvectors

What is an eigenvector? An eigenvector of a matrix is a vector that, when multiplied by the matrix, yields the same vector scaled by a scalar (the eigenvalue): $$A \vec{v} = \lambda \vec{v}$$ What is the geometric explanation of eigenvectors? It tells you which directions do not change under the transformation matrix $$A$$: the eigenvectors keep their direction and only stretch by the factor of the eigenvalue ($$\lambda$$).
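
A minimal NumPy sketch of this property (the diagonal matrix here is just an illustrative example, not from the post): `np.linalg.eig` returns the eigenvalues and eigenvectors, and we can check that $$A \vec{v} = \lambda \vec{v}$$ holds for each pair.

```python
import numpy as np

# Example matrix: scales the x-axis by 3 and the y-axis by 2
A = np.array([[3.0, 0.0],
              [0.0, 2.0]])

# np.linalg.eig returns the eigenvalues and the eigenvectors (as columns)
eigenvalues, eigenvectors = np.linalg.eig(A)

for i in range(len(eigenvalues)):
    v = eigenvectors[:, i]   # i-th eigenvector
    lam = eigenvalues[i]     # matching eigenvalue
    # The defining property: A @ v equals lambda * v
    assert np.allclose(A @ v, lam * v)
    print(f"lambda = {lam:.1f}, v = {v}")
```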

Cosine Similarity

What is cosine similarity? It is a metric to compare two vectors based on the angle between them, so it is not affected by the size of the vectors. The cosine of the angle between two vectors is: $$\cos(\beta) = \frac{\vec{v} \cdot \vec{w}}{|\vec{v}| \, |\vec{w}|}$$ If the vectors are orthogonal, the cosine of 90° is 0, so a cosine similarity of 0 means the vectors are orthogonal, i.e. they have no similarity.
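
A small NumPy sketch of this formula (the vectors are made-up examples): scaling a vector does not change the similarity, and orthogonal vectors score 0.

```python
import numpy as np

def cosine_similarity(v, w):
    # cos(beta) = (v . w) / (|v| * |w|)
    return np.dot(v, w) / (np.linalg.norm(v) * np.linalg.norm(w))

v = np.array([1.0, 2.0, 3.0])
w = np.array([2.0, 4.0, 6.0])  # same direction as v, just twice as long

print(cosine_similarity(v, w))                  # 1.0: only the angle matters
print(cosine_similarity(np.array([1.0, 0.0]),
                        np.array([0.0, 1.0])))  # 0.0: orthogonal vectors
```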

JSII - Create libraries in TypeScript, use them everywhere!

jsii is a toolchain that allows you to write libraries in TypeScript and use them from Python, C#, and Java.
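
The best-known jsii consumer is the AWS CDK: the library is authored once in TypeScript, and the Python package on PyPI is generated by jsii. A minimal sketch of what that looks like from the Python side (assuming `aws-cdk-lib` is installed; the stack and output names here are made up):

```python
# aws-cdk-lib is written in TypeScript and published to PyPI through jsii,
# yet it reads like native Python.
import aws_cdk as cdk

app = cdk.App()
stack = cdk.Stack(app, "HelloJsiiStack")
cdk.CfnOutput(stack, "Greeting", value="hello from a TypeScript library")
app.synth()
```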

Likelihood

Naive Bayes

In Naive Bayes we estimate the probability of each class by using the joint probability of the words in each class. The Naive Bayes score is just the ratio between these two probabilities: the product of the prior and the likelihoods.

Why is Naive Bayes named naive? Because it makes the assumption that the features used for classification are independent.

Algorithm: get the frequency of each word in each class, freq(word, class), and the total number of words in each class, countWords(class). Now we can calculate $$P(word \mid class)$$ as $$\frac{freq(word, class)}{countWords(class)}$$ To infer whether a sentence is more likely one class or the other we compute $$\frac{P(positive)}{P(negative)} \prod_{i=1}^{m} \frac{P(w_i \mid positive)}{P(w_i \mid negative)}$$ If the value is > 1, the sentence is overall positive. The prior ratio is $$\frac{P(positive)}{P(negative)}$$ and the likelihood ratio is $$\prod_{i=1}^{m} \frac{P(w_i \mid positive)}{P(w_i \mid negative)}$$

Log Likelihood: why do we use the log likelihood? For numeric stability, preventing underflow: $$\log\left(\frac{P(positive)}{P(negative)} \prod_{i=1}^{m} \frac{P(w_i \mid positive)}{P(w_i \mid negative)}\right)$$ Since $$\log(a \cdot b) = \log(a) + \log(b)$$, we can rewrite Naive Bayes using the log likelihood as $$\log\left(\frac{P(positive)}{P(negative)}\right) + \log\left(\prod_{i=1}^{m} \frac{P(w_i \mid positive)}{P(w_i \mid negative)}\right)$$ and, applying the same decomposition inside the product, as a sum of log ratios. Since we are now in logarithms, a sentence is positive if the log likelihood is > 0.

Assumptions: Naive Bayes assumes that features are independent; in NLP this specifically means that there is no overlap of meaning between words, which is not true.
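
A minimal Python sketch of this algorithm. The tiny training set and the add-one (Laplace) smoothing are my own additions to keep the example self-contained and avoid zero probabilities; the formulas above do not include smoothing.

```python
import math
from collections import Counter

# Toy training data: (sentence, label). Made-up examples for illustration.
train = [
    ("i love this movie", "positive"),
    ("great acting and great story", "positive"),
    ("i hate this movie", "negative"),
    ("boring story and bad acting", "negative"),
]

# freq(word, class) and countWords(class)
freq = {"positive": Counter(), "negative": Counter()}
for sentence, label in train:
    freq[label].update(sentence.split())

count_words = {c: sum(freq[c].values()) for c in freq}
vocab = set(freq["positive"]) | set(freq["negative"])
n_docs = {c: sum(1 for _, label in train if label == c) for c in freq}

def log_likelihood(sentence):
    # log prior ratio: log(P(positive) / P(negative))
    score = math.log(n_docs["positive"] / n_docs["negative"])
    for word in sentence.split():
        if word not in vocab:
            continue  # ignore unseen words
        # P(w | class) = (freq(w, class) + 1) / (countWords(class) + |V|)
        p_pos = (freq["positive"][word] + 1) / (count_words["positive"] + len(vocab))
        p_neg = (freq["negative"][word] + 1) / (count_words["negative"] + len(vocab))
        score += math.log(p_pos / p_neg)  # sum of log ratios
    return score

print(log_likelihood("i love the story"))  # > 0 -> classified as positive
print(log_likelihood("boring and bad"))    # < 0 -> classified as negative
```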