A summary of how AI has progressed in the last 5 years and current challenges (by ChatGPT)
Hey there, fellow data strugglers! It’s no secret that the field of artificial intelligence (AI) has been making massive strides over the last few years. Let’s take a quick trip down memory lane and see how things have progressed since 2017. First up, we’ve seen...
cetrulin,
2 years ago
3 min read
4 reasons for Agile in Analytics
Working on several projects as a Data Science consultant, I’ve realized the need to spread the word about project planning in this field. This is neither a defense of Agile nor an open letter criticizing project managers (PMs) who prefer other methodologies. It’s more a post...
cetrulin,
6 years ago
3 min read
Struggling with Hive… What can I do?
If you are on a Big Data project, you may have experienced how slow Hive is at JOINing a couple of tables of a few TBs (well, even GBs, to be honest). The first option always appears to be using PARQUET as your default storage format and...
cetrulin,
7 years ago
3 min read
Working in a Big Data Project using the terminal
So, you have just landed on a big data project. Everybody knows how to use HDFS except you. All the data sits in a big cluster and you don’t know how to access it. You are not really into graphical interfaces, so you...
cetrulin,
7 years ago
4 min read
Scraping stock prices using Alpha Vantage and Google Finance
Stock price scraping can be a nightmare if the APIs you’re trying to use are not up to date. A few months ago I was looking for free sources of one-minute-level data. Apart from having trouble with the Yahoo Finance API (apparently not up to date by...
cetrulin,
7 years ago
5 min read
Flattening complex XML structures into Hive tables using Spark DFs
A couple of months ago at work we faced an issue where we received XML files with structs nested in structs and arrays (which also contained structs). We normally face these issues in Hive. Our ETL guy ingests the XML into HDFS in...
cetrulin,
7 years ago
9 min read