I hope I actually read through and retain something from these very interesting reads:
Complete tutorial on Python for Data Analysis: https://github.com/cuttlefishh/python-for-data-analysis/blob/master/lessons/lesson03.md
IBM THINK: https://www.ibm.com/think
What do LLMs understand: https://towardsdatascience.com/what-do-large-language-models-understand-befdb4411b77/
Gradient Descent: https://cs231n.github.io/optimization-1/
Basics of Nearest Neighbours: https://cs231n.github.io/classification/
What is a p-norm: https://planetmath.org/vectorpnorm
FLANN - Fast library for approximate nearest neighbours: https://github.com/flann-lib/flann
t-SNE: https://lvdmaaten.github.io/tsne/ - t-Distributed Stochastic Neighbor Embedding (t-SNE) is a technique for dimensionality reduction that is particularly well suited for the visualization of high-dimensional datasets. The technique can be implemented via Barnes-Hut approximations, allowing it to be applied to large real-world datasets.
A Few Useful Things to Know About Machine Learning (2012 article): https://homes.cs.washington.edu/~pedrod/papers/cacm12.pdf
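
Note to self, tying the p-norm and nearest-neighbour links together: the p-norm of a vector x is (sum of |x_i|^p)^(1/p), and a nearest-neighbour lookup just picks the stored point with the smallest p-norm distance to the query. Below is a minimal sketch, assuming vectors are plain Python lists and p >= 1. The names `p_norm` and `nearest_neighbour` are my own placeholders, not taken from any of the linked pages, and this is a brute-force search, not what FLANN does.

```python
def p_norm(vec, p=2.0):
    """Return the p-norm (sum |x_i|^p)^(1/p) of a sequence of numbers."""
    if p < 1:
        # For p < 1 the triangle inequality fails, so it is not a true norm.
        raise ValueError("p must be >= 1")
    return sum(abs(x) ** p for x in vec) ** (1.0 / p)


def nearest_neighbour(query, points, p=2.0):
    """Return the index of the point closest to query under the p-norm.

    Brute force: computes the distance to every stored point.
    Ties go to the earliest point in the list.
    """
    distances = [
        p_norm([a - b for a, b in zip(query, point)], p)
        for point in points
    ]
    return min(range(len(points)), key=distances.__getitem__)
```

With p=1 this is the Manhattan (L1) distance and with p=2 the Euclidean (L2) distance, so `nearest_neighbour(query, points, p=1)` and `nearest_neighbour(query, points, p=2)` can pick different neighbours for the same data.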