Zeppelin is an interactive notebook. It lets you write code into a web page, execute it, and...
Author - Walker Rowe
Spark Decision Tree Classifier
Here we explain how to use the Decision Tree Classifier with Apache Spark ML (machine learning). We...
Using Logistic Regression, Scala, and Spark
Here we explain how to do logistic regression with Apache Spark. Logistic regression (LR) is...
SGD Linear Regression Example with Apache Spark
This article explains how to do linear regression with Apache Spark. It assumes you have some basic...
Hadoop Interview Questions
This article is part of our Hadoop Guide.
Hadoop Clusters: An Introduction
Hadoop clusters 101. In talking about Hadoop clusters, first we need to define two terms: cluster...
An Introduction to Hadoop Administration
Here we explain some of the most common Hadoop administrative tasks. There are many, so we only...
An Introduction to Hive
Overview: Hive is very similar to Apache Pig. It lets you create tables and load...
An Introduction to Hadoop Analytics
Hadoop Analytics 101. Apache Hadoop by itself does not do analytics. But it provides a platform and...
Introduction to Apache Pig
Apache Pig 101. Apache Pig, developed at Yahoo, was written to make it easier to work with Hadoop...