About Us

Management

4 reasons for Agile in Analytics

Working on several projects as a Data Science consultant, I’ve realized the need to spread the word about project planning in this field. This is neither a defense of Agile nor an open letter criticizing project managers (PMs) who prefer other methodologies. It’s more a post…

Lifesavers

Struggling with Hive… What can I do?

If you work on a Big Data project, you may have experienced how slow Hive is to JOIN a couple of tables of a few TBs (well, even GBs, to be honest). The first option always seems to be using PARQUET as your default storage format and…

Lifesavers

Working in a Big Data Project using the terminal

So, you have just landed in a big data project. Everybody knows how to use HDFS except you. All the data sits in a big cluster and you don’t know how to access it. You are not really into graphical interfaces, so you…

Stock market trend prediction

Scraping stock prices using Alpha Vantage and Google Finance

Stock price scraping can be a nightmare if the APIs you’re trying to use are not up to date. A few months ago I was looking for free sources of one-minute-level data. Apart from having trouble with the Yahoo Finance API (apparently not up to date by…

Big Data

Flattening complex XML structures into Hive tables using Spark DFs

A couple of months ago at work we faced an issue: we received XML files with structs nested in structs and arrays (which also contained structs). We normally face these issues in Hive. Our ETL guy ingests the XML into HDFS in…

Mauricio