At first, I thought I would begin this blog by defining, and then quickly summarizing, machine learning. After some thought and a few false starts on that first post, though, I've found that the concept is so vague and broad that it defies a clear definition. Here's my best shot anyway, a quick and dirty attempt:
Applications of Machine Learning, or Statistical Learning (SL) as I'll refer to it in these first few posts, generally apply statistical models to large datasets in order to draw out patterns that are too complex to be represented algorithmically, and then use the fitted model to interpret new information.
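To make that definition a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the toy data, the choice of a straight-line model, and the hypothetical helper names `fit_line` and `predict`. Real SL systems use far larger datasets and richer models, but the shape is the same: fit a model to observed data, then apply it to something new.

```python
# Toy illustration of the definition: fit a statistical model to data,
# then use the fitted model to interpret a new input.
# The data, the linear model, and the function names are all assumptions.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-deviations and squared deviations from the means
    # (the 1/n factors cancel in the slope, so they are omitted).
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Apply the fitted (slope, intercept) pair to a new input."""
    slope, intercept = model
    return slope * x + intercept

# Hypothetical training data: noisy observations of roughly y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]

model = fit_line(xs, ys)          # "draw out the pattern"
new_value = predict(model, 5.0)   # "interpret new information"
print(model, new_value)
```

The model never sees the rule that generated the data; it only recovers an approximation of it from the examples, which is the whole trick.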
There's a curve that I rode when I started learning about Statistical Learning: first, I had some difficulty understanding how one could possibly use a statistical model to predict something that seems far too random or subtle for direct modeling. Second, I began to suspect, with some hope, that SL could be used to solve almost every computational problem in the world given enough data. Third, and where I am now, I've realized that the world and its processes are generally far more complex than we'll be able to model, even statistically, for a very, very long time.
This blog will hopefully take the readers along that path as well, or at least chronicle my experiences along the way. Expect disillusionment, and a fairly thorough explanation of how the entire field shouldn't be called a field, can't be termed a study, but is instead an amusing collection of tricks pulled from a statistician's hat -- tricks that are incredibly useful, and in some cases far more effective than any alternative, but simple tricks nonetheless.
Anything I discuss in detail will come with the clearest explanation of the necessary background that I can provide, though I will assume a rudimentary knowledge of statistics and programming. Wherever more than that is needed in either field, I'll provide links to useful internet articles or books that have helped me.
I'll also frequently post links to papers and talks from the field, with some minor comments, though the reader is left the difficult task of getting any background necessary to read them.
In the next few posts, expect a vague discussion of how one can design a Statistical Learning system by self-examination, and of how SL can make an amusing, though poor, approximation of a master author.