[Ubuntu] Install Hadoop 3.0.0 & Hive on Ubuntu 16.04

What’s Hadoop?

Hadoop is an open-source framework that lets you store and process big data in a distributed environment across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage.

The project includes these modules:

Hadoop Common: The common utilities that support the other Hadoop modules.
Hadoop Distributed File System (HDFS™): A distributed file system that provides high-throughput access to application data.
Hadoop YARN: A framework for job scheduling and cluster resource management.
Hadoop MapReduce: A YARN-based system for parallel processing of large data sets.
[read more…]
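To make the "simple programming models" idea concrete, here is a minimal sketch of the map, shuffle, and reduce steps behind a word count, written in plain Python. It does not use Hadoop at all; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample input are assumptions made up for illustration, and a real MapReduce job would run these phases in parallel across the cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group emitted values by key, mimicking the step between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all counts for one word into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, not taken from the original post.
    sample = ["big data big clusters", "data on clusters"]
    print(word_count(sample))
```

The point of the split is that each `map_phase` call depends only on its own line and each `reduce_phase` call only on its own key, which is what allows a framework to distribute the work across many machines.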

Read more →

[Ubuntu] Install Spark 2.2.1 on Ubuntu 16.04

What is Spark?
Apache Spark is a fast and general-purpose cluster computing system. It provides high-level APIs in Java, Scala, Python and R, and an optimized engine that supports general execution graphs. It also supports a rich set of higher-level tools including Spark SQL for SQL and structured data processing, MLlib for machine learning, GraphX for graph processing, and Spark Streaming. [read more…]

Read more →

[Python] Use Python boto3 and crontab to back up MySQL to AWS S3

I wrote a python scr…

Read more →

[Ubuntu] Using Amazon RDS with WordPress

In order to improve …

Read more →

[Ubuntu] Install WordPress on AWS EC2 Ubuntu 14.04

I change my blog fro…

Read more →