IMPORTANT: This documentation is no longer updated. Refer to Elastic's version policy and the latest documentation.

Preface

Elasticsearch for Apache Hadoop is an ‘umbrella’ project consisting of three related yet independent sub-projects, each with its own dedicated section in the documentation:

elasticsearch-hadoop proper
interact with Elasticsearch from within a Hadoop environment. If you are using Map/Reduce, Hive, Pig, Apache Spark, Apache Storm, or Cascading, this project is for you. For feature requests or bugs, please open an issue in the Elasticsearch-Hadoop repository.
repository-hdfs
use HDFS as a back-end repository for doing snapshot/restore from/to Elasticsearch. For more information, refer to its home page. For feature requests or bugs, please open an issue in the Elasticsearch repository with the ":Plugin Repository HDFS" tag.
Elasticsearch on YARN
run Elasticsearch on top of YARN; see the Elasticsearch on YARN section. This project is in beta.

Thus, while all three projects fall under the Hadoop umbrella, each covers a specific aspect of it, so be sure to read the appropriate documentation. For general questions about any of these projects, the Elastic Discuss forum is a good place to interact with other users in the community.