What's Machine Learning?

At the end of my engineering studies, I knew I wanted to work in two tech sub-fields: web development and data science.

Machine learning is a subfield of the vast field of data science.

Nowadays, computational power, as well as telecom and storage infrastructure, has dramatically improved: it’s possible not only to generate a lot of data, but also to collect and store it.

The problem is that we lack techniques to extract useful knowledge from this large amount of raw data, so we need new methods to analyze it and make decisions based on the newly found facts.

That’s where machine learning comes in, with the goal of developing artificial systems that improve their performance with experience. We mainly use machine learning for two kinds of tasks: supervised learning, where a model learns from labeled examples and then classifies new objects, and unsupervised learning, where an algorithm finds useful groupings of objects in unlabeled data.
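To make the difference between the two concrete, here is a minimal sketch in plain Python. Everything in it is an assumption chosen for illustration: the animal measurements are made up, the supervised part uses a nearest-centroid classifier, and the unsupervised part uses a bare-bones k-means. Real projects would normally rely on a library rather than hand-written helpers like these.

```python
# Toy data: (height_cm, weight_kg) with a label. Values are invented for illustration.
labeled = [
    ((25.0, 4.0), "cat"),
    ((23.0, 5.0), "cat"),
    ((60.0, 30.0), "dog"),
    ((55.0, 25.0), "dog"),
]


def centroid(points):
    """Average position of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Supervised learning: labels are known, so we learn one centroid per label.
def fit_nearest_centroid(examples):
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(pts) for label, pts in by_label.items()}


def predict(model, features):
    """Classify a new object by the closest learned centroid."""
    return min(model, key=lambda label: sq_dist(model[label], features))


# Unsupervised learning: no labels, so we only look for groups (k-means).
def k_means(points, k, iterations=10):
    centers = list(points[:k])  # naive initialization: the first k points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(centers[i], p))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [centroid(c) if c else centers[i] for i, c in enumerate(clusters)]
    return clusters


model = fit_nearest_centroid(labeled)
prediction = predict(model, (58.0, 28.0))  # classify a new, unseen object

unlabeled = [features for features, _ in labeled]  # same data, labels dropped
groups = k_means(unlabeled, k=2)
```

The supervised model can name the class of a new object because it was shown labels. The k-means part only returns two groups of points: deciding what each group means is left to whoever reads the result.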

A machine learning project can be broken down into six phases:

  • Business understanding, to understand the business objectives and requirements in order to convert them into a data mining problem definition.
  • Data understanding, which is about getting data and getting familiar with it (we need to understand what’s in it: its quality, its context, and its features).
  • Data preparation, the phase where we build a dataset that we will feed to our data mining algorithms (data cleaning, data transformation, data sorting, etc.).
  • Modeling, where we let the algorithm run to create the data model that will inform future business decisions.
  • Evaluation, to measure the model’s quality and whether it fulfills the business objectives.
  • Deployment, which aims to organize the extracted knowledge so that it is understandable. The data mining process also has to be made repeatable for future use.
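The six phases above can be sketched end to end on a toy problem. This is only an illustration under stated assumptions: the spam-filtering goal, the 80% accuracy target, the sample messages, and every function name are hypothetical, and the "model" is a deliberately naive word list rather than a real algorithm.

```python
# 1. Business understanding (assumed goal): flag spam with at least 80% accuracy.
TARGET_ACCURACY = 0.8

# 2. Data understanding: raw records are (text, label); some are incomplete.
raw = [
    ("WIN a FREE prize now", "spam"),
    ("free money, click here", "spam"),
    ("Lunch at noon?", "ham"),
    ("Meeting notes attached", "ham"),
    ("", "ham"),                    # empty text
    ("Free tickets inside", None),  # missing label
    ("Claim your free gift", "spam"),
    ("See you tomorrow", "ham"),
]


# 3. Data preparation: drop incomplete records, then lowercase and split into words.
def prepare(records):
    return [(text.lower().split(), label) for text, label in records if text and label]


# 4. Modeling: "learn" the set of words seen only in spam messages.
def train(dataset):
    spam_words, ham_words = set(), set()
    for words, label in dataset:
        (spam_words if label == "spam" else ham_words).update(words)
    return spam_words - ham_words


def predict(model, words):
    return "spam" if any(w in model for w in words) else "ham"


# 5. Evaluation: measure accuracy on data the model has not seen.
def accuracy(model, dataset):
    correct = sum(predict(model, words) == label for words, label in dataset)
    return correct / len(dataset)


dataset = prepare(raw)
train_set, test_set = dataset[:4], dataset[4:]
model = train(train_set)
score = accuracy(model, test_set)
meets_objective = score >= TARGET_ACCURACY


# 6. Deployment: expose the result as one repeatable, easy-to-call function.
def classify(text):
    return predict(model, text.lower().split())
```

With so little data the score says almost nothing about real-world quality. The point is only to show where each phase sits and how the evaluation is compared against the objective fixed in the first phase.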

Personally, I’m interested in applying machine learning to text processing. That’s one of the things I try to study when I find the time, and one of the upcoming features of Cowriters uses something I learned from it.